Dalea: A Persistent Multi-Level Extendible Hashing with Improved Tail Performance

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Persistent memory (PM) promises byte-addressability, large capacity, and durability, and main memory systems such as key-value stores and in-memory databases benefit from these features. Because hash indexes are widely used in main memory systems, many research efforts have aimed at persistent hashing with high average performance. However, existing persistent hashing schemes still show suboptimal tail performance, in terms of both tail throughput and tail latency. In this paper, we analyze the major sources of suboptimal tail performance in light of the key design issues of persistent hashing. We identify the global hash structure and concurrency control as design spaces that remain open for improving tail performance. We propose Directory-sharing Multi-level Extendible Hashing (Dalea) for PM. Dalea combines ancestor link-based extendible hashing with fine-grained transient locks to address the two main sources of poor tail performance, rehashing and locking. Evaluation results show that, compared with the state-of-the-art persistent hashing scheme Dash, Dalea increases tail throughput by 4.1x and reduces tail latency by 5.4x. Moreover, to provide design guidelines for improving tail performance, we use Dalea as a testbed to identify the different impacts of four factors on tail performance: fine-grained rehashing, transient locking, memory pre-allocation, and fingerprinting.
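To make the rehashing cost named in the abstract concrete, the following is a minimal sketch of classic, single-level, in-DRAM extendible hashing. It is not Dalea's ancestor link-based multi-level design, and it has no persistence or locking. The class names, the `bucket_capacity` parameter, and the `splits` counter are assumptions introduced only for illustration. The point it shows is that an insert into a full bucket must first split that bucket and move its entries, so a few inserts do far more work than the average one.

```python
class Bucket:
    """A fixed-capacity bucket that remembers how many hash bits it uses."""

    def __init__(self, local_depth, capacity):
        self.local_depth = local_depth
        self.capacity = capacity
        self.items = {}


class ExtendibleHash:
    """Classic extendible hashing (illustrative only, not Dalea itself)."""

    def __init__(self, bucket_capacity=4):
        self.global_depth = 1
        # The directory has 2**global_depth entries; several entries may
        # point at the same bucket.
        self.directory = [Bucket(1, bucket_capacity), Bucket(1, bucket_capacity)]
        self.splits = 0  # hypothetical counter: inserts that paid for a rehash

    def _index(self, key):
        # Use the low global_depth bits of the hash as the directory index.
        return hash(key) & ((1 << self.global_depth) - 1)

    def get(self, key, default=None):
        return self.directory[self._index(key)].items.get(key, default)

    def put(self, key, value):
        # Assumes keys do not all share one hash value; otherwise splitting
        # could never separate them and this loop would not terminate.
        while True:
            bucket = self.directory[self._index(key)]
            if key in bucket.items or len(bucket.items) < bucket.capacity:
                bucket.items[key] = value
                return
            self._split(bucket)  # slow path: this insert pays for rehashing

    def _split(self, bucket):
        if bucket.local_depth == self.global_depth:
            # Double the directory. Entry i and entry i + old_size share the
            # same low bits, so they start out pointing at the same bucket.
            self.directory = self.directory + self.directory
            self.global_depth += 1
        bit = 1 << bucket.local_depth
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth, bucket.capacity)
        # Redirect the directory entries whose newly used bit is set.
        for i, b in enumerate(self.directory):
            if b is bucket and (i & bit):
                self.directory[i] = sibling
        # Rehash: move every entry whose newly used bit is set.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            (sibling if hash(k) & bit else bucket).items[k] = v
        self.splits += 1


table = ExtendibleHash(bucket_capacity=4)
for i in range(1000):
    table.put(i, i * i)
```

In this sketch the occasional call to `_split` is where the extra latency comes from. A persistent design would additionally have to make the split crash-consistent and coordinate it with concurrent operations, which is where the locking cost mentioned in the abstract enters.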

Author information

Corresponding author

Correspondence to De-Jun Jiang.

About this article

Cite this article

Xiong, ZW., Jiang, DJ., Xiong, J. et al. Dalea: A Persistent Multi-Level Extendible Hashing with Improved Tail Performance. J. Comput. Sci. Technol. 38, 1051–1073 (2023). https://doi.org/10.1007/s11390-023-2957-8

  • DOI: https://doi.org/10.1007/s11390-023-2957-8
