Understanding and analysis of B+ trees on NVM towards consistency and efficiency

  • Regular Paper
CCF Transactions on High Performance Computing

Abstract

Emerging non-volatile memory (NVM) offers DRAM-like performance and disk-like persistency, driving a trend toward single-level storage systems in which NVM replaces both DRAM and disks. Using NVM as the universal main memory brings opportunities and challenges to the design of new persistent in-memory data structures. In this context, several prior works have designed consistent and persistent B+ trees on NVM. However, all of them evaluate the performance of B+ trees with an NVM performance simulator and cannot provide concrete guidance on how to develop B+ trees that perform well on NVM. In this paper, using Intel Optane DC persistent memory, we study and analyze, through a series of experiments, the factors that influence the design of B+ trees on NVM, and provide guidance on how to design efficient B+ trees on NVM. From our experiments and analysis, we draw several conclusions that are either not presented in prior works or contrary to current ideas. For example, we discover that the performance of B+ trees is greatly affected by data formats. We also analyze the software layer optimizations and hardware layer optimizations separately and find that software layer optimizations do not always improve performance. Furthermore, because B+ trees place multiple entries in one node, entries must be shifted and nodes balanced on updates; in FPTree, this shift and balance overhead accounts for 39% of the total overhead.
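To make the shift overhead mentioned above concrete, the following is a minimal sketch, not the paper's implementation or measurement method. It assumes a single node modelled as a Python list and ignores splits, balancing, and persistence instructions. The function names `insert_sorted`, `insert_unsorted`, and `count_shifts` are hypothetical and chosen only for illustration. The sketch counts how many existing entries must move when a key is inserted into a sorted node, compared with appending to an unsorted node; on NVM, each moved entry would be an additional store that also has to be made persistent.

```python
import bisect


def insert_sorted(node, key):
    """Insert key into a sorted node (a list) and return how many entries shifted."""
    pos = bisect.bisect_left(node, key)
    shifted = len(node) - pos  # every entry from pos onward moves one slot right
    node.insert(pos, key)
    return shifted


def insert_unsorted(node, key):
    """Append key to an unsorted node; no existing entry moves."""
    node.append(key)
    return 0


def count_shifts(keys, insert):
    """Insert all keys into one fresh node and return (node, total entries shifted)."""
    node = []
    total = 0
    for key in keys:
        total += insert(node, key)
    return node, total


if __name__ == "__main__":
    keys = [5, 4, 3, 2, 1]  # descending order is the worst case for a sorted node
    sorted_node, sorted_shifts = count_shifts(keys, insert_sorted)
    unsorted_node, unsorted_shifts = count_shifts(keys, insert_unsorted)
    print("sorted node:", sorted_node, "shifts:", sorted_shifts)
    print("unsorted node:", unsorted_node, "shifts:", unsorted_shifts)
```

The unsorted layout avoids shifts at insert time but gives up binary search within the node, so lookups must scan the entries. This is a simplified view of the trade-off, under the assumptions stated above.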

Funding

This work is supported by the National Key Research & Development Program of China (Grant No. 2018YFB1003301), the National Natural Science Foundation of China (Grant No. 61832011), and the Huawei Innovation Research Program (Grant No. 20202000097). HE's research at Temple University is partially supported by the U.S. NSF grant CCF-1717660.

Author information

Corresponding author

Correspondence to Jiangkun Hu.

About this article

Cite this article

Hu, J., Chen, Y., Lu, Y. et al. Understanding and analysis of B+ trees on NVM towards consistency and efficiency. CCF Trans. HPC 2, 36–49 (2020). https://doi.org/10.1007/s42514-020-00022-z

  • DOI: https://doi.org/10.1007/s42514-020-00022-z
