Skip to main content
Log in

Bulk-loading and bulk-insertion algorithms for \(\hbox {xBR}^{+}\hbox {-trees}\) in Solid State Drives

  • Published:
Computing Aims and scope Submit manuscript

Abstract

Spatial indexes are important in spatial databases for efficient execution of queries involving spatial constraints. The \(\hbox {xBR}^{+}\hbox {-tree}\) is a balanced, disk-resident, Quadtree-based index for point data, which is very efficient for processing spatial queries. Bulk-loading refers to the process of creating an index from scratch as a whole, when the dataset to be indexed is available beforehand, instead of creating (loading) the index gradually, when the dataset items are available one-by-one. Bulk insertion refers to the process of updating an existing index by inserting a large batch of new data, treating the items of this batch as a whole and not by inserting these items one-by-one. In this paper, we modify previous bulk-loading and bulk-insertion algorithms for \(\hbox {xBR}^{+}\hbox {-trees}\) to achieve higher performance by taking advantage of the special features of Solid State Drives (SSDs). SSDs have attracted database developers, mainly due to their higher read performance (thanks to their internal parallelism) than Hard Disk Drives. Using real and artificial datasets of various cardinalities, we experimentally compare the modified algorithms against their predecessors and show that the modified algorithms are clear winners regarding performance.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4

Similar content being viewed by others

Notes

  1. Retrieved from http://spatialhadoop.cs.umn.edu/datasets.html.

References

  1. Achakeev D, Seeger B, Widmayer P (2012) Sort-based query-adaptive loading of R-trees. In: CIKM conference, pp 2080–2084

  2. Achakeev D, Seidemann M, Schmidt M, Seeger B (2012) Sort-based parallel loading of R-trees. In: BigSpatial workshop, pp 62–70

  3. An N, Kanth KVR, Ravada S (2003) Improving performance with bulk-inserts in Oracle R-trees. In: VLDB conference, pp 948–951

    Chapter  Google Scholar 

  4. Arge L, Hinrichs KH, Vahrenhold J, Vitter JS (1999) Efficient bulk operations on dynamic R-trees. In: ALENEX workshop, pp 328–348

    Chapter  Google Scholar 

  5. Arge L, Hinrichs KH, Vahrenhold J, Vitter JS (2002) Efficient bulk operations on dynamic R-trees. Algorithmica 33(1):104–128

    Article  MathSciNet  Google Scholar 

  6. Berchtold S, Böhm C, Kriegel H (1998) Improving the query performance of high-dimensional index structures by bulk-load operations. In: EDBT conference, pp 216–230

    Google Scholar 

  7. Carniel AC, Ciferri RR, de Aguiar Ciferri CD (2017) A generic and efficient framework for spatial indexing on flash-based solid state drives. In: ADBIS conference, pp 229–243

    Chapter  Google Scholar 

  8. Chen L, Choubey R, Rundensteiner EA (1998) Bulk-insertions info R-trees using the small-tree-large-tree approach. In: ACM-GIS conference, pp 161–162

  9. Chen L, Choubey R, Rundensteiner EA (2002) Merging R-trees: efficient strategies for local bulk insertion. GeoInformatica 6(1):7–34

    Article  Google Scholar 

  10. Cho S, Chang S, Jo I (2015) The solid-state drive technology, today and tomorrow. In: ICDE conference, pp 1520–1522

  11. Choubey R, Chen L, Rundensteiner EA (1999) GBI: A generalized R-tree bulk-insertion strategy. In: SSD conference, pp 91–108

    Chapter  Google Scholar 

  12. Ciaccia P, Patella M (1998) Bulk loading the M-tree. In: ADC conference, pp 15–26

  13. Cornwell M (2012) Anatomy of a solid-state drive. Commun ACM 55(12):59–63

    Article  Google Scholar 

  14. den Bercken JV, Seeger B (2001) An evaluation of generic bulk loading techniques. In: VLDB conference, pp 461–470

  15. den Bercken JV, Seeger B, Widmayer P (1997) A generic approach to bulk loading multidimensional index structures. In: VLDB conference, pp 406–415

  16. Emrich T, Graf F, Kriegel H, Schubert M, Thoma M (2010) On the impact of flash SSDs on spatial indexing. In: DaMoN conference, pp 3–8

  17. Fevgas A, Bozanis P (2015) Grid-file: towards to a flash efficient multi-dimensional index. In: DEXA conference, pp 285–294

    Google Scholar 

  18. Ghanem TM, Shah R, Mokbel MF, Aref WG, Vitter JS (2004) Bulk operations for space-partitioning trees. In: ICDE conference, pp 29–40

  19. Hady FT, Foong AP, Veal B, Williams D (2017) Platform storage performance with 3d XPoint technology. Proc IEEE 105(9):1822–1833

    Article  Google Scholar 

  20. Hjaltason GR, Samet H (1999) Improved bulk-loading algorithms for quadtrees. In: ACM-GIS conference, pp 110–115

  21. Hjaltason GR, Samet H (2002) Speeding up construction of PMR quadtree-based spatial indexes. VLDB J 11(2):109–137

    Article  Google Scholar 

  22. Hjaltason GR, Samet H, Sussmann YJ (1997) Speeding up bulk-loading of quadtrees. In: ACM-GIS conference, pp 50–53

  23. Jin P, Ou Y, Harder T, Li Z (2012) AD-lRU: an efficient buffer replacement algorithm for flash-based databases. Data Knowl Eng 72:83–102

    Article  Google Scholar 

  24. Jin P, Xie X, Wang N, Yue L (2015) Optimizing R-tree for flash memory. Expert Syst Appl 42(10):4676–4686

    Article  Google Scholar 

  25. Kamel I, Faloutsos C (1993) On packing R-trees. In: CIKM conference, pp 490–499

  26. Kamel I, Khalil M, Kouramajian V (1996) Bulk insertion in dynamic R-trees. In: SDH conference, pp 3B.31–3B.42

  27. Koltsidas I, Viglas SD (2011) Spatial data management over flash memory. In: SSTD conference, pp 449–453

    Chapter  Google Scholar 

  28. Lee T, Moon B, Lee S (2006) Bulk insertion for R-trees by seeded clustering. Data Knowl Eng 59(1):86–106

    Article  MathSciNet  Google Scholar 

  29. Leutenegger ST, Edgington JM, Lopez MA (1997) STR: a simple and efficient algorithm for R-tree packing. In: ICDE conference, pp 497–506

  30. Li G, Zhao P, Yuan L, Gao S (2013) Efficient implementation of a multi-dimensional index structure over flash memory storage systems. J Supercomput 64(3):1055–1074

    Article  Google Scholar 

  31. Lv Y, Li J, Cui B, Chen X (2011) Log-compact R-tree: an efficient spatial index for SSD. In: DASFAA workshops, pp 202–213

    Chapter  Google Scholar 

  32. Papadopoulos A, Manolopoulos Y (2003) Parallel bulk-loading of spatial data. Parallel Comput 29(10):1419–1444

    Article  MathSciNet  Google Scholar 

  33. Park S, Jung D, Kang J, Kim J, Lee J (2006) CFLRU: a replacement algorithm for flash memory. In: Proceedings of the 2006 international conference on compilers, architecture and synthesis for embedded systems. ACM, pp 234–241

  34. Pawlik M, Macyna W (2012) Implementation of the aggregated R-tree over flash memory. In: DASFAA workshops, pp 65–72

    Chapter  Google Scholar 

  35. Roh H, Park S, Kim S, Shin M, Lee S (2011) B\(^{+}\)-tree index optimization by exploiting internal parallelism of flash-based solid state drives. PVLDB 5(4):286–297

    Google Scholar 

  36. Roumelis G, Vassilakopoulos M, Corral A (2011) Performance comparison of xBR-trees and R*-trees for single dataset spatial queries. In: ADBIS conference, pp 228–242

    Chapter  Google Scholar 

  37. Roumelis G, Vassilakopoulos M, Corral A, Manolopoulos Y (2016) Bulk-loading xBR\(^+\)-trees. In: MEDI conference, pp 57–71

  38. Roumelis G, Vassilakopoulos M, Corral A, Manolopoulos Y (2017) Bulk insertions into xBR\(^+\)-trees. In: MEDI conference, pp 185–199

  39. Roumelis G, Vassilakopoulos M, Corral A, Manolopoulos Y (2017) Efficient query processing on large spatial databases: a performance study. J Syst Softw 132:165–185

    Article  Google Scholar 

  40. Roumelis G, Vassilakopoulos M, Corral A, Manolopoulos Y (2018) An efficient algorithm for bulk-loading xBR\(^+\)-trees. Comput Stand Interfaces 57:83–100

    Article  Google Scholar 

  41. Roumelis G, Vassilakopoulos M, Loukopoulos T, Corral A, Manolopoulos Y (2015) The xBR\(^+\)-tree: an efficient access method for points. In: DEXA conference, pp 43–58

    Google Scholar 

  42. Roussopoulos N, Kotidis Y, Roussopoulos M (1997) Cubetree: Organization of and bulk updates on the data cube. In: SIGMOD conference, pp 89–99

    Article  Google Scholar 

  43. Roussopoulos N, Leifker D (1985) Direct spatial search on pictorial databases using packed R-trees. In: SIGMOD conference, pp 17–31

    Article  Google Scholar 

  44. Sarwat M, Mokbel MF, Zhou X, Nath S (2011) FAST: a generic framework for flash-aware spatial trees. In: SSTD conference, pp 149–167

    Chapter  Google Scholar 

  45. Shekhar S, Chawla S (2003) Spatial databases—a tour. Prentice Hall, Upper Saddle River

    Google Scholar 

  46. Vassilakopoulos M, Manolopoulos Y (2000) External balanced regular (x-BR) trees: new structures for very large spatial databases. In: Fotiadis DI, Nikolopoulos SD (eds) Advances in informatics: selected papers of the 7th hellenic conference on informatics (HCI ’99). World Scientific, Singapore, pp 324–333. https://doi.org/10.1142/9789812793928_0029

  47. Vespa TG, Traina C Jr, Traina AJM (2010) Efficient bulk-loading ondynamic metric access methods. Inf Syst 35(5):557–569

    Article  Google Scholar 

  48. Wu C, Chang L, Kuo T (2003) An efficient R-tree implementation over flash-memory storage systems. In: ACM-GIS conference, pp 17–24

  49. Yang C, Jin P, Yue L, Yang P (2016) Efficient buffer management for tree indexes on solid state drives. Int J Parallel Program 44(1):5–25

    Article  Google Scholar 

Download references

Acknowledgements

Work of M. Vassilakopoulos, A. Corral and Y. Manolopoulos funded by the MINECO research projects [TIN2013-41576-R] and [TIN2017-83964-R].

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Michael Vassilakopoulos.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Roumelis, G., Fevgas, A., Vassilakopoulos, M. et al. Bulk-loading and bulk-insertion algorithms for \(\hbox {xBR}^{+}\hbox {-trees}\) in Solid State Drives. Computing 101, 1539–1563 (2019). https://doi.org/10.1007/s00607-019-00709-4

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00607-019-00709-4

Keywords

Navigation