research-article

ChainKV: A Semantics-Aware Key-Value Store for Ethereum System

Published: 12 December 2023

Abstract

The Log-Structured Merge-tree (LSM-tree) based key-value (KV) store has been widely adopted as the storage engine for blockchain systems such as Ethereum, in which blockchain data are uniformly transformed into randomly distributed KV items for persistence. However, blockchain semantics are ignored during this process, so blockchain storage suffers from heavy read/write amplification. Moreover, as the Ethereum network scales up, the growing data volume further exacerbates the storage burden. Until now, most studies have focused on sharding, data archiving, decentralized distributed storage, and similar techniques to mitigate the burden on the storage layer. However, the incompatibility between Ethereum semantics and the characteristics of the storage engine has been overlooked.
In this paper, we present ChainKV, a new semantics-aware storage paradigm that improves storage management performance for the Ethereum system. First, based on Ethereum blockchain semantics, ChainKV stores different types of data separately in multiple storage zones within the KV store to mitigate the read/write amplification problem. Second, following the verification process of the authenticated data structure (ADS), a new ADS data transformer is proposed to exploit data locality when persisting the ADS. Third, a new space gaming caching policy is adopted to coordinate cache space management for the two independent storage zones. Finally, we propose an optional lightweight node crash recovery mechanism to eliminate functional redundancy between the Ethereum protocol and the storage engine. The experimental results indicate that ChainKV outperforms prior Ethereum systems by up to 1.99× and 4.20× for synchronization and query operations, respectively.
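The idea of routing KV items into separate storage zones by data type can be illustrated with a minimal sketch. This is not ChainKV's implementation: the split into a "state" zone and a "non_state" zone, the `s:` key prefix used to classify items, and the `ZonedStore` class are all assumptions made for illustration, and plain dictionaries stand in for independent LSM-trees.

```python
# Illustrative sketch only: zone names, the key prefix, and the class are hypothetical.
STATE_PREFIX = b"s:"  # assumed tag for state (ADS) items; not taken from the paper


class ZonedStore:
    """Toy two-zone KV store; each dict stands in for an independent LSM-tree."""

    def __init__(self):
        self.zones = {"state": {}, "non_state": {}}

    @staticmethod
    def zone_of(key: bytes) -> str:
        # Classify an item by its (assumed) semantic tag instead of treating
        # all keys as one undifferentiated, randomly distributed key space.
        return "state" if key.startswith(STATE_PREFIX) else "non_state"

    def put(self, key: bytes, value: bytes) -> None:
        self.zones[self.zone_of(key)][key] = value

    def get(self, key: bytes):
        # A lookup touches only the zone that can contain the key.
        return self.zones[self.zone_of(key)].get(key)

    def zone_sizes(self) -> dict:
        return {name: len(items) for name, items in self.zones.items()}


if __name__ == "__main__":
    store = ZonedStore()
    store.put(b"s:node-1", b"trie node")
    store.put(b"h:block-1", b"block header")
    print(store.zone_sizes())
```

Because each item type lives in its own zone, writes and compactions for one type would not have to rewrite data of the other type, which is the intuition behind reducing read/write amplification in a real LSM-tree backed design.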



Published In

Proceedings of the ACM on Management of Data, Volume 1, Issue 4
PACMMOD
December 2023
1317 pages
EISSN: 2836-6573
DOI: 10.1145/3637468
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. ChainKV
  2. semantics-aware KV store
  3. storage bottleneck in blockchain systems

Qualifiers

  • Research-article
