A Language-Agnostic Compression Framework for the Bitcoin Blockchain

  • Conference paper
Advanced Information Networking and Applications (AINA 2024)

Abstract

This research addresses the growing interdisciplinary interest in Bitcoin by proposing a versatile compression framework that transforms raw blockchain data into a streamlined, compact format suitable for high-performance analysis. Our approach centers on a language-agnostic API, so that the data remain accessible from any programming language. Beyond data extraction, the framework outputs the Bitcoin user transaction graph, facilitating network analysis, forensics, and pattern detection. Processed data are exported to the HDF5 file format for compatibility with mainstream analysis tools. A proof-of-concept CPython implementation demonstrates the framework’s feasibility and its applicability to data-driven investigations in Bitcoin research.
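
The abstract does not say how the user transaction graph is derived. As an illustrative sketch only, the Python below assumes the widely used common-input-ownership heuristic (addresses spent together in one transaction are treated as belonging to one user) together with a union-find structure. The names `UnionFind` and `build_user_graph`, and the toy transaction tuples, are hypothetical and are not the framework's API.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set forest with path halving and union by size."""

    def __init__(self):
        self.parent = {}
        self.size = {}

    def find(self, x):
        # Register unseen items as singleton sets.
        if x not in self.parent:
            self.parent[x] = x
            self.size[x] = 1
        # Path halving: point each visited node at its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Attach the smaller tree under the larger one.
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def build_user_graph(transactions):
    """Return {(payer_user, payee_user): total_value}.

    Assumed toy input shape (hypothetical): each transaction is
    (input_addresses, [(output_address, value), ...]).
    A user is identified by the representative address of its cluster.
    """
    uf = UnionFind()

    # Pass 1: cluster addresses that appear together as inputs.
    for inputs, outputs in transactions:
        for addr in inputs:
            uf.union(inputs[0], addr)
        for out_addr, _ in outputs:
            uf.find(out_addr)  # make sure every address is registered

    # Pass 2: aggregate value flows between the resulting users.
    edges = defaultdict(int)
    for inputs, outputs in transactions:
        if not inputs:  # e.g. coinbase transactions have no payer
            continue
        payer = uf.find(inputs[0])
        for out_addr, value in outputs:
            edges[(payer, uf.find(out_addr))] += value
    return dict(edges)
```

Running the clustering pass over all transactions before building edges matters: an address may be merged into a user by a later transaction, so edges built in a single pass could point at stale representatives. Change outputs returned to the payer appear as self-loops in this sketch.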

Author information

Correspondence to Orestes Papanastassiou.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Papanastassiou, O., Thomo, A. (2024). A Language-Agnostic Compression Framework for the Bitcoin Blockchain. In: Barolli, L. (eds) Advanced Information Networking and Applications. AINA 2024. Lecture Notes on Data Engineering and Communications Technologies, vol 200. Springer, Cham. https://doi.org/10.1007/978-3-031-57853-3_20
