Abstract
Tree index structures are crucial components in data management systems. Existing tree index structures are designed with the implicit assumption that the underlying external memory storage is a conventional magnetic hard disk drive. This assumption will soon be invalid, as flash memory storage is increasingly adopted as the main storage medium in mobile devices, digital cameras, embedded sensors, and notebooks. Though it is simple to port existing tree index structures directly to flash memory storage, that direct approach does not consider the unique characteristics of flash memory, i.e., slow write operations and the erase-before-update property, and would therefore result in suboptimal performance. In this paper, we introduce FAST (i.e., Flash-Aware Search Trees) as a generic framework for flash-aware tree index structures. FAST distinguishes itself from all previous attempts at flash memory indexing in two aspects: (1) FAST is a generic framework that can be applied to a wide class of data partitioning tree structures, including the R-tree and its variants, and (2) FAST achieves both efficiency and durability of read and write flash operations through memory flushing and crash recovery techniques. Extensive experimental results, based on an actual implementation of FAST inside the GiST index structure in PostgreSQL, show that FAST achieves better performance than its competitors.
Notes
In a typical flash memory, the costs of read, write, and erase operations are 25, 200, and 1,500 μs, respectively [3].
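To make the asymmetry in these figures concrete, the following is a minimal sketch of the arithmetic, not the FAST algorithm itself. It assumes a deliberately simplified cost model in which every in-place page update pays one erase plus one write; real devices erase at block granularity and may remap pages, so the numbers are only indicative. The function names are hypothetical and introduced purely for illustration.

```python
# Latencies quoted in the note above, in microseconds.
READ_US = 25
WRITE_US = 200
ERASE_US = 1500


def update_cost_us(needs_erase: bool) -> int:
    """Cost of updating one page: a write, plus an erase when the target
    location must be cleared first (erase-before-update).
    Assumption: one erase per updated page, which is a simplification."""
    return WRITE_US + (ERASE_US if needs_erase else 0)


def in_place_cost_us(n_updates: int) -> int:
    """Hypothetical cost when each of n updates to the same page is
    written to flash immediately."""
    return n_updates * update_cost_us(needs_erase=True)


def coalesced_cost_us(n_updates: int) -> int:
    """Hypothetical cost when the n updates are first merged in main
    memory and the page is written to flash once."""
    return update_cost_us(needs_erase=True) if n_updates > 0 else 0


if __name__ == "__main__":
    print("write/read ratio:", WRITE_US / READ_US)
    print("erase + write (us):", update_cost_us(needs_erase=True))
    print("10 updates in place (us):", in_place_cost_us(10))
    print("10 updates coalesced (us):", coalesced_cost_us(10))
```

Under this simplified model, a single update that requires an erase costs as much as 68 page reads, which is why write-heavy index maintenance is the operation a flash-aware design most needs to reduce.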
References
PostgreSQL. http://www.postgresql.org
Agrawal D, Ganesan D, Sitaraman RK, Diao Y, Singh S (2009) Lazy-adaptive tree: an optimized index structure for flash devices. PVLDB
Agrawal N, Prabhakaran V, Wobber T, Davis J, Manasse M, Panigrahy R (2008) Design tradeoffs for SSD performance. In: Usenix annual technical conference, USENIX
Bayer R, McCreight EM (1972) Organization and maintenance of large ordered indices. Acta Inform 1:173–189
Beckmann N, Kriegel H-P, Schneider R, Seeger B (1990) The R*-tree: an efficient and robust access method for points and rectangles. In: SIGMOD
Birrell A, Isard M, Thacker C, Wobber T (2007) A design for high-performance flash disks. ACM SIGOPS Oper Syst Rev 41(2):88–93
Bouganim L, Jónsson B, Bonnet P (2009) uFLIP: understanding flash IO patterns. In: CIDR
Chang Y-H, Hsieh J-W, Kuo T-W (2007) Endurance enhancement of flash-memory storage systems: an efficient static wear leveling design. In: Proceedings of the annual ACM IEEE Design Automation Conference, DAC, pp 212–217
Chen S (2009) FlashLogging: exploiting flash devices for synchronous logging performance. In: SIGMOD. New York, NY
Comer D (1979) The ubiquitous B-tree. ACM Comput Surv 11(2):121–137
Gray J (2006) Tape is dead, disk is tape, flash is disk, RAM locality is king. http://research.microsoft.com/~gray/talks/Flash_is_Good.ppt. Accessed Dec 2006
Gray J, Fitzgerald B (2008) Flash disk opportunity for server applications. ACM Queue 6(4):18–23
Gray J, Graefe G (1997) The five-minute rule ten years later, and other computer storage rules of thumb. SIGMOD Rec 26(4):63–68
Guttman A (1984) R-trees: a dynamic index structure for spatial searching. In: SIGMOD
Hellerstein JM, Naughton JF, Pfeffer A (1995) Generalized search trees for database systems. In: VLDB
Hutsell W (2007) Solid state storage for the enterprise. Storage Networking Industry Association (SNIA) Tutorial, Fall
Katayama N, Satoh S (1997) The SR-tree: an index structure for high-dimensional nearest neighbor queries. In: SIGMOD
Kim H, Ahn S (2008) BPLRU: a buffer management scheme for improving random writes in flash storage. In: FAST
Lavenier D, Xinchun X, Georges G (2006) Seed-based genomic sequence comparison using a FPGA/FLASH accelerator. In: ICFPT
Lee S, Moon B (2007) Design of flash-based DBMS: an in-page logging approach. In: SIGMOD
Lee S-W, Moon B, Park C, Kim J-M, Kim S-W (2008) A case for flash memory SSD in enterprise database applications. In: SIGMOD
Lee S-W, Park D-J, Chung T-S, Lee D-H, Park S, Song H-J (2007) A log buffer-based flash translation layer using fully-associative sector translation. TECS
Leventhal A (2008) Flash storage today. ACM Queue 6(4):24–30
Li Y, He B, Luo Q, Yi K (2009) Tree indexing on flash disks. In: ICDE
Li Y, He B, Yang RJ, Luo Q, Yi K (2010) Tree indexing on solid state drives. Proceedings of the VLDB Endowment 3(1–2):1195–1206
Ma D, Feng J, Li G (2011) LazyFTL: a page-level flash translation layer optimized for NAND flash memory. In: SIGMOD
McCreight EM (1977) Pagination of B*-trees with variable-length records. Commun ACM 20(9):670–674
Moshayedi M, Wilkison P (2008) Enterprise SSDs. ACM Queue 6(4):32–39
Nath S, Gibbons PB (2008) Online maintenance of very large random samples on flash storage. In: VLDB
Nath S, Kansal A (2007) FlashDB: dynamic self-tuning database for NAND flash. In: IPSN
Reinsel D, Janukowicz J (2008) Datacenter SSDs: solid footing for growth. http://www.samsung.com/us/business/semiconductor/news/downloads/210290.pdf. Accessed Jan 2008
Sellis TK, Roussopoulos N, Faloutsos C (1987) The R+-tree: a dynamic index for multi-dimensional objects. In: VLDB
Shah MA, Harizopoulos S, Wiener JL, Graefe G (2008) Fast scans and joins using flash drives. In: International Workshop on Data Management on New Hardware, DaMoN
White DA, Jain R (1996) Similarity indexing with the SS-tree. In: ICDE
Wu C, Chang L, Kuo T (2003) An efficient R-tree implementation over flash-memory storage systems. In: GIS
Wu C, Kuo T, Chang L (2007) An efficient B-tree layer implementation for flash-memory storage systems. TECS
Additional information
The research of M. Sarwat and M. F. Mokbel is supported in part by the National Science Foundation under Grants IIS-0811998, IIS-0811935, CNS-0708604, IIS-0952977, by a Microsoft Research Gift, and by a seed grant from UMN DTC.
About this article
Cite this article
Sarwat, M., Mokbel, M.F., Zhou, X. et al. Generic and efficient framework for search trees on flash memory storage systems. Geoinformatica 17, 417–448 (2013). https://doi.org/10.1007/s10707-012-0164-9
DOI: https://doi.org/10.1007/s10707-012-0164-9