ChainKV: A Semantics-Aware Key-Value Store for Ethereum System

Published: 12 December 2023


The log-structured merge-tree (LSM-tree) based key-value (KV) store has been widely adopted as the storage engine for blockchain systems such as Ethereum, in which blockchain data are uniformly transformed into randomly distributed KV items for persistence. However, blockchain semantics are ignored during this transformation, so the blockchain storage suffers from heavy read/write amplification. Moreover, as the Ethereum network scales up, the rapidly growing volume of data further exacerbates the storage burden. Until now, most studies have focused on sharding, data archiving, decentralized distributed storage, and similar techniques to mitigate the burden on the storage layer, while the incompatibility between Ethereum semantics and the characteristics of the storage engine has been overlooked.
In this paper, we present ChainKV, a new semantics-aware storage paradigm that improves storage management performance for the Ethereum system. First, based on Ethereum blockchain semantics, ChainKV stores different types of data separately in multiple storage zones within the KV store to mitigate the read/write amplification problem. Second, following the verification process of the authenticated data structure (ADS), a new ADS data transformer is proposed to exploit data locality when persisting the ADS. Moreover, a new space gaming caching policy is adopted to coordinate cache space management for the two independent storage zones. Finally, we propose an optional lightweight node crash recovery mechanism to eliminate functional redundancy between the Ethereum protocol and the storage engine. The experimental results indicate that ChainKV outperforms prior Ethereum systems by up to 1.99× and 4.20× for synchronization and query operations, respectively.
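The idea of storing different data types in separate zones can be illustrated with a minimal sketch. It is not ChainKV's implementation: it assumes the two zones hold state data (ADS nodes) and non-state data (blocks, transactions, receipts), and the zone names, the one-byte key prefix `b"s"`, and the `ZonedStore` class are all hypothetical. Plain dictionaries stand in for the independent LSM-tree instances.

```python
from enum import Enum


class Zone(Enum):
    STATE = "state"          # ADS (trie) nodes; assumed zone name
    NON_STATE = "non_state"  # blocks, transactions, receipts; assumed zone name


# Hypothetical key prefixes that mark an item as state data.
STATE_PREFIXES = (b"s",)


class ZonedStore:
    """Toy store that keeps one independent dict per zone.

    Each dict stands in for a separate LSM-tree, so writes to one
    data type never mix with (or trigger compaction of) the other.
    """

    def __init__(self):
        self.zones = {zone: {} for zone in Zone}

    @staticmethod
    def zone_of(key: bytes) -> Zone:
        # Route by data type, inferred here from the key prefix.
        if key.startswith(STATE_PREFIXES):
            return Zone.STATE
        return Zone.NON_STATE

    def put(self, key: bytes, value: bytes) -> None:
        self.zones[self.zone_of(key)][key] = value

    def get(self, key: bytes):
        # Returns None when the key is absent from its zone.
        return self.zones[self.zone_of(key)].get(key)


store = ZonedStore()
store.put(b"s\x01", b"trie-node")   # lands in the state zone
store.put(b"b\x01", b"block-body")  # lands in the non-state zone
```

Because each lookup consults only the zone that can contain the key, a read never has to search through data of the other type, which is the intuition behind the reduced amplification claimed above.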


