Abstract
Flash-memory-based solid-state drives (SSDs) are used widely for secondary storage. To be effective for SSDs, traditional indices have to be redesigned to cope with the special properties of flash memory, such as asymmetric read/write latencies (fast reads and slow writes) and out-of-place updates. Previous flash-optimized indices focus mainly on reducing random writes to SSDs, which is typically accomplished at the expense of a substantial number of extra reads. However, modern SSDs show a narrowing gap between read and write speeds, and read operations on SSDs increasingly affect the overall performance of indices on SSDs. As a consequence, how to optimize SSD-aware indices by reducing both write and read costs is a pertinent and open challenge. We propose a new tree index for SSDs that is able to reduce both writes and extra reads. In particular, we use an update buffer and overflow pages to reduce random writes, and we further exploit Bloom filters to reduce the extra reads to the overflow nodes in the tree. With this mechanism, we construct a read/write-optimized index that is capable of offering better overall performance than previous flash-aware indices. In addition, we present an analysis of the proposed index and show that the read and write costs of the operations on the index can be balanced by only tuning the false-positive rate of the Bloom filters. Our experimental results suggest that our proposal is efficient and represents an improvement over existing methods.
Similar content being viewed by others
Notes
FTL is designed for address translation between block addresses in file systems and physical cells on flash chips.
References
Agrawal, D., Ganesan, D., Sitaraman, R., Diao, Y., Singh, S.: Lazy-adaptive tree: An optimized index structure for flash devices. In: Proceedings of the VLDB Endowment, vol. 2(1), pp. 361–372 (2009)
Ahn, J.S., Kang, D., Jung, D., Kim, J.S., Maeng, S.: \(\mu \)*-Tree: an ordered index structure for nand flash memory with adaptive page layout scheme. IEEE Trans. Comput. 62(4), 784–797 (2013)
Athanassoulis, M., Ailamaki, A.: BF-Tree: approximate tree indexing. In: Proceedings of the VLDB Endowment vol. 7(14), pp. 1881–1892 (2014)
Bayer, R., Schkolnick, M.: Concurrency of operations on B-trees. Acta inform. 9(1), 1–21 (1977)
BenchmarkSQL. http://sourceforge.net/projects/benchmarksql/
Bloom, B.H.: Space/time trade-offs in hash coding with allowable errors. Commun. ACM 13(7), 422–426 (1970)
Bonomi, F., Mitzenmacher, M., Panigrahy, R., Singh, S., Varghese, G.: An improved construction for counting Bloom filters. In: ESA, pp. 684–695 (2006)
Canim, M., Mihaila, G.A., Bhattacharjee, B., Lang, C.A., Ross, K.A.: Buffered Bloom filters on solid state storage. In: VLDB Workshop on ADMS, pp. 1–8 (2010)
Debnath, B., Sengupta, S., Li, J., Lilja, D.J., Du, D.H.C.: BloomFlash: Bloom filter on flash-based storage. In: ICDCS, pp. 635–644 (2011)
Dirik, C., Jacob, B.: The performance of PC solid-state disks (SSDs) as a function of bandwidth, concurrency, device architecture, and system organization. In: ISCA, pp. 279–289 (2009)
Fang, H., Yeh, M., Suei, P., Kuo, T.: An adaptive endurance-aware B+-Tree for flash memory storage systems. IEEE Trans. Comput. 63(11), 2661–2673 (2013)
Graefe, G.: A survey of B-tree locking techniques. ACM Trans. Database Syst. 35(3), 16 (2010)
Graefe, G., Halim, F., Idreos, S., Kuno, H., Manegold, S.: Concurrency control for adaptive indexing. In: Proceedings of the VLDB Endowment vol. 5(7), pp. 656–667 (2012)
Graefe, G., Halim, F., Idreos, S., Kuno, H., Manegold, S., Seeger, B.: Transactional support for adaptive indexing. VLDB J. 23(2), 303–328 (2014)
Graefe, G., Kimura, H., Kuno, H.: Foster B-trees. ACM Trans. Database Syst. 37(3), 17 (2012)
InformationWeek 2014 state of enterprise storage survey. http://www.seiservice.com/wp-content/uploads/2014/07/2014StateofEnterpriseStorage.pdf
Jaluta, I., Sippu, S., Soisalon-Soininen, E.: Concurrency control and recovery for balanced B-link trees. VLDB J. 14(2), 257–277 (2005)
Jin, P., Ou, Y., Härder, T., Li, Z.: AD-LRU: an efficient buffer replacement algorithm for flash-based databases. Data Knowl. Eng. 72, 83–102 (2012)
Jin, R., Cho, H.J., Lee, S.W., Chung, T.S.: Lazy-split B+-tree: a novel B+-tree index scheme for flash-based database systems. Des. Autom. Embed. Syst. 17(1), 167–191 (2013)
Kang, D., Jung, D., Kang, J.U., Kim, J.S.: \(\mu \)-Tree: an ordered index structure for NAND flash memory. In: EMSOFT, pp. 144–153 (2007)
Kim, T.: SSD Technology and application trend. http://www.jedec.org/sites/default/files/20101004_FlashSummit_Sanjose_TonyKim.pdf
Lee, S.W., Moon, B., Park, C.: Advances in flash memory SSD technology for enterprise database applications. In: SIGMOD, pp. 863–870 (2009)
Lehman, P.L., et al.: Efficient locking for concurrent operations on B-trees. ACM Trans. Database Syst. 6(4), 650–670 (1981)
Li, Y., He, B., Luo, Q., Yi, K.: Tree indexing on flash disks. In: ICDE, pp. 1303–1306 (2009)
Li, Y., He, B., Yang, R.J., Luo, Q., Yi, K.: Tree indexing on solid state drives. In: Proceedings of the VLDB Endowment vol. 3(1–2), pp. 1195–1206 (2010)
Li, Z., Jin, P., Su, X., Cui, K., Yue, L.: CCF-LRU: a new buffer replacement algorithm for flash memory. IEEE Trans. Consum. Electron. 55(3), 1351–1359 (2009)
Lomet, D., Salzberg, B.: Concurrency and recovery for index trees. VLDB J. 6(3), 224–240 (1997)
Lu, G., Nam, Y.J., Du, D.H.: Bloomstore: Bloom-filter based memory-efficient key-value store for indexing of data deduplication on flash. In: MSST, pp. 1–11 (2012)
Lu, K., Jin, P., Yang, P., Wan, S., Yue, L.: Adaptive in-page logging for flash-memory storage systems. Front. Comput. Sci. 8(1), 131–144 (2014)
Nath, S., Kansal, A.: FlashDB: Dynamic self-tuning database for NAND flash. In: IPSN, pp. 410–419 (2007)
Ou, Y., Härder, T., Jin, P.: CFDC: a flash-aware replacement policy for database buffer management. In: DAMON, pp. 15–20 (2009)
Ouyang, J., Lin, S., Jiang, S., Hou, Z., Wang, Y., Wang, Y.: SDF: Software-defined flash for web-scale internet storage systems. In: ASPLOS, pp. 471–484 (2014)
Roh, H., Kim, W.C., Kim, S., Park, S.: A B-tree index extension to enhance response time and the life cycle of flash memory. Inf. Sci. 179(18), 3136–3161 (2009)
Roh, H., Park, S., Kim, S., Shin, M., Lee, S.W.: B+-tree index optimization by exploiting internal parallelism of flash-based solid state drives. In: Proceedings of the VLDB Endowment vol. 5(4), pp. 286–297 (2011)
Sarwat, M., Mokbel, M.F., Zhou, X., Nath, S.: FAST: a generic framework for flash-aware spatial trees. In: Advances in Spatial and Temporal Databases, pp. 149–167. Springer (2011)
Sarwat, M., Mokbel, M.F., Zhou, X., Nath, S.: Generic and efficient framework for search trees on flash memory storage systems. GeoInformatica 17(3), 417–448 (2013)
SMHasher & MurmurHash. https://code.google.com/p/smhasher/
TPC-C. http://www.tpc.org/tpcc/
Viglas, S.D.: Adapting the B+-tree for asymmetric I/O. In: ADBIS, pp. 399–412 (2012)
Weikum, G., Vossen, G.: Transactional information systems: theory, algorithms, and the practice of concurrency control and recovery. Morgan Kaufmann (2002)
Wu, C.H., Kuo, T.W., Chang, L.P.: An efficient B-tree layer implementation for flash-memory storage systems. ACM Trans. Embed. Comput. Syst. 6(3), 19 (2007)
Yin, S., Pucheral, P.: PBFilter: a flash-based indexing scheme for embedded systems. Inf. Syst. 37(7), 634–653 (2012)
Yin, S., Pucheral, P., Meng, X.: A sequential indexing scheme for flash-based embedded systems. In: EDBT, pp. 588–599 (2009)
Acknowledgments
We would like to thank the editors and anonymous reviewers for their valuable suggestions and comments to improve the quality of the paper. This work is partially supported by the National Science Foundation of China under the Grant (61379037, 61472376) and the Fundamental Research Funds for the Central Universities of China. Christian S. Jensen was supported in part by a grant from the Obel Family Foundation.
Author information
Authors and Affiliations
Corresponding author
Appendix A: transformation from Eqs. 5.8 to 5.9
Appendix A: transformation from Eqs. 5.8 to 5.9
Proof.
Then, we can simplify \(\sum _{x=0}^{s}f^{x BF } \cdot x\) as shown in the following steps.
Next, we can infer that \(\sum _{x=0}^{s}f^{x BF } \cdot 2=2\) through the following transformation.
This gives the following equation.
Thus, we have shown that Eq. 5.9 can be derived from Eq. 5.8.
Rights and permissions
About this article
Cite this article
Jin, P., Yang, C., Jensen, C.S. et al. Read/write-optimized tree indexing for solid-state drives. The VLDB Journal 25, 695–717 (2016). https://doi.org/10.1007/s00778-015-0406-1
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00778-015-0406-1