Abstract
Indexing data on flash-based Solid State Drives (SSDs) is an important paradigm recently applied in spatial data management. In recent years, the design of new spatial access methods for SSDs, named flash-aware spatial indices, has attracted the attention of many researchers, mainly to exploit the advantages of SSDs in spatial query processing. eFIND is a generic framework for transforming a disk-based spatial index into a flash-aware one, taking into account the intrinsic characteristics of SSDs. In this article, we present a systematic approach for porting disk-based data-driven and space-driven access methods to SSDs through the eFIND framework. We also present the actual porting of representative data-driven (R-trees, R*-trees, and Hilbert R-trees) and space-driven (xBR+-trees) access methods through this framework. Moreover, we present an extensive experimental evaluation that compares the performance of these ported indices when inserting and querying synthetic and real point datasets. The main conclusions of this experimental study are that the eFIND R-tree excels in insertions, the eFIND xBR+-tree is the fastest for different types of spatial queries, and the eFIND Hilbert R-tree is efficient for processing intersection range queries.
Acknowledgements
This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. This work has also been supported by CNPq and by the São Paulo Research Foundation (FAPESP). Cristina D. Aguiar has been supported by the grant #2018/22277-8, FAPESP. The work of Michael Vassilakopoulos and Antonio Corral is funded by the MINECO research project [TIN2017-83964-R].
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Anderson Chaves Carniel initiated this work at the Federal University of Technology - Paraná, Dois Vizinhos, PR 85660-000, Brazil.
About this article
Cite this article
Carniel, A.C., Roumelis, G., Ciferri, R.R. et al. Porting disk-based spatial index structures to flash-based solid state drives. Geoinformatica 26, 253–298 (2022). https://doi.org/10.1007/s10707-021-00455-w
DOI: https://doi.org/10.1007/s10707-021-00455-w