DOI: 10.1145/3064176.3064215
research-article

Atomic In-place Updates for Non-volatile Main Memories with Kamino-Tx

Published: 23 April 2017

Abstract

Data structures for non-volatile memories must be designed so that they can be modified atomically using transactions. Existing atomicity methods require data to be copied in the critical path, which significantly increases transaction latency. These overheads are further amplified for transactions on byte-addressable persistent memories, where the byte ranges modified by a data structure update are often significantly smaller than the granularity at which data can be efficiently copied and logged. We propose Kamino-Tx, which provides a new way to perform transactional updates on non-volatile byte-addressable memories (NVM) without requiring any copying of data in the critical path. Kamino-Tx maintains an additional copy of data off the critical path to achieve atomicity. In doing so, Kamino-Tx has to overcome two important challenges: ensuring safety and minimizing NVM storage overhead. To reduce storage overhead, we propose a more dynamic approach to maintaining the additional copy of data. To further mitigate the storage overhead of using Kamino-Tx in a replicated setting, we develop Kamino-Tx-Chain, a variant of Chain Replication in which replicas perform in-place updates and do not maintain data copies locally; replicas in Kamino-Tx-Chain leverage other replicas as copies to roll back or forward for atomicity. Our results show that Kamino-Tx increases throughput by up to 9.5x for unreplicated systems and up to 2.2x for replicated settings.
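The idea of updating data in place while a second copy trails behind off the critical path can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's implementation: it uses ordinary Python dictionaries instead of NVM, and the names `InPlaceStore`, `run`, `sync_backup`, and `fail_after` are hypothetical. The rule that a transaction first waits for the backup to catch up on any key it is about to overwrite is likewise an assumption, made here so that rollback always restores the last committed value.

```python
class InPlaceStore:
    """Toy model (hypothetical): in-place updates with a trailing backup copy."""

    def __init__(self, data):
        self.main = dict(data)    # updated in place on the critical path
        self.backup = dict(data)  # additional copy, refreshed off the critical path
        self.pending = []         # committed write sets not yet applied to the backup

    def sync_backup(self):
        """Bring the backup up to date; meant to run off the critical path."""
        for writes in self.pending:
            self.backup.update(writes)
        self.pending.clear()

    def run(self, writes, fail_after=None):
        """Apply `writes` in place; on a simulated crash, roll back from the backup."""
        # Assumed safety rule: do not overwrite a key whose last committed
        # value has not reached the backup yet.
        if any(key in done for done in self.pending for key in writes):
            self.sync_backup()
        touched = []
        try:
            for i, (key, value) in enumerate(writes.items()):
                if fail_after is not None and i == fail_after:
                    raise RuntimeError("simulated crash")
                self.main[key] = value  # no copy is made before the write
                touched.append(key)
        except RuntimeError:
            # Undo the partial transaction using the backup copy.
            for key in touched:
                if key in self.backup:
                    self.main[key] = self.backup[key]
                else:
                    self.main.pop(key, None)
            return False
        self.pending.append(dict(writes))
        return True


store = InPlaceStore({"a": 1, "b": 2})
committed = store.run({"a": 10, "b": 20})               # commits in place
aborted = store.run({"a": 100, "b": 200}, fail_after=1)  # crashes midway, rolls back
```

In this sketch the committed transaction never copies old values before writing; the cost of keeping the second copy is paid later in `sync_backup`, at the price of storing the data twice.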


Published In

EuroSys '17: Proceedings of the Twelfth European Conference on Computer Systems
April 2017
648 pages
ISBN:9781450349383
DOI:10.1145/3064176

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EuroSys '17
EuroSys '17: Twelfth EuroSys Conference 2017
April 23 - 26, 2017
Belgrade, Serbia

Acceptance Rates

Overall Acceptance Rate 241 of 1,308 submissions, 18%
