Abstract
Despite its promise of transparency, security, and decentralization, blockchain technology faces significant challenges in data storage and query efficiency. Current indexing methods, which often rely on structures such as Merkle trees and Patricia tries, incur excessive storage overhead and slow query responses, particularly on full nodes that maintain a complete copy of the blockchain. To address this, we introduce a novel learned indexing approach for blockchain that uses a layered structure with a sliding window search enhanced Online Gradient Descent (SWS-OGD) as the inter-block index. The method was implemented across five distinct blockchain environments: Bitcoin, Ethereum, Dogecoin, Litecoin, and IoTeX. Experimental results show that the proposed method reduces storage costs by up to 99% compared to state-of-the-art approaches, requiring as little as 0.9 KB for 20,000 blocks. Despite this reduction in storage costs, SWS-OGD maintains comparable performance on other key metrics, such as query latency. These results indicate that blockchain systems can handle large-scale data queries efficiently and sustain high performance as the blockchain grows.
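To make the general idea concrete, the following is a minimal Python sketch of a learned inter-block index: a linear model fitted by online gradient descent predicts a block's position from its key, and a sliding window search corrects the prediction. It is an illustration under stated assumptions, not the SWS-OGD implementation evaluated in the article. The names `LearnedBlockIndex` and `sliding_window_search`, the use of block timestamps as keys, the normalization parameters, the learning rate, and the window size are all hypothetical choices made for this example.

```python
from bisect import bisect_left


def sliding_window_search(keys, target, guess, window=8):
    """Find `target` in sorted `keys`, starting near `guess`.

    A fixed-size window is slid left or right (overlapping the previous
    window by one element) until it brackets the target. Returns the
    index of the target, or -1 if it is absent.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    n = len(keys)
    if n == 0:
        return -1
    # Centre the first window on the guess, clamped to the valid range.
    lo = min(max(guess - window // 2, 0), n - 1)
    while True:
        hi = min(lo + window, n)
        if target < keys[lo]:
            if lo == 0:
                return -1
            lo = max(lo - (window - 1), 0)  # slide left
        elif target > keys[hi - 1]:
            if hi == n:
                return -1
            lo = hi - 1  # slide right
        else:
            i = bisect_left(keys, target, lo, hi)
            return i if keys[i] == target else -1


class LearnedBlockIndex:
    """Hypothetical inter-block index: position ~ w * key + b.

    The model itself is only two floats; `keys` stands in for block
    data (assumed here to be strictly increasing timestamps).
    """

    def __init__(self, key_origin, key_span, pos_span, lr=0.5, window=8):
        self.key_origin = key_origin  # assumed: key of the first block
        self.key_span = key_span      # assumed: expected key range
        self.pos_span = pos_span      # assumed: expected number of blocks
        self.lr = lr
        self.window = window
        self.w = 0.0
        self.b = 0.0
        self.keys = []

    def _normalize(self, key):
        return (key - self.key_origin) / self.key_span

    def append(self, key):
        """Add a block and take one gradient step on the squared error."""
        pos = len(self.keys)
        self.keys.append(key)
        x = self._normalize(key)
        err = (self.w * x + self.b) - pos / self.pos_span
        self.w -= self.lr * err * x
        self.b -= self.lr * err

    def predict(self, key):
        """Predicted position of `key`; may be off, the search corrects it."""
        return int(round((self.w * self._normalize(key) + self.b) * self.pos_span))

    def lookup(self, key):
        return sliding_window_search(self.keys, key, self.predict(key), self.window)
```

In this sketch, each new block key is passed to `append`, and `lookup` returns the block position or -1. Lookup correctness does not depend on model accuracy, because the window keeps sliding until it brackets the key; a better-fitted model only reduces the number of slides.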














Data availability
The data used in this research will be made available upon request.
Code availability
The code used for the experiments will be made available upon request.
Funding
Not applicable.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Asiamah, E.A., Akrasi-Mensah, N.K., Odame, P. et al. A storage-efficient learned indexing for blockchain systems using a sliding window search enhanced online gradient descent. J Supercomput 81, 321 (2025). https://doi.org/10.1007/s11227-024-06805-3
DOI: https://doi.org/10.1007/s11227-024-06805-3