Predicting functional roles of Ethereum blockchain addresses

Peer-to-Peer Networking and Applications

Abstract

Ethereum is one of the largest blockchain programming platforms. Users in Ethereum are identified by addresses derived from public-private key pairs, which are difficult to connect to real-world identities. This has encouraged a variety of illegal activities. However, these addresses can be linked and identified based on the functional roles evident in their transactions. In this paper, we propose a methodology for predicting the functional roles of Ethereum addresses using machine learning. We build machine learning models that predict the functional role of an address from various features derived from its transactional history over varying window sizes. We use a labeled dataset of 300 million transactions that are publicly available on the Ethereum blockchain. Results on the test data show that the XGBoost classifier, with an eleven-feature vector and a window size of 200, predicts the role of an unseen address with the best achieved accuracy of 73%. We also trained and tested deep learning models on the dataset; the CNN model predicted the labels with 86% accuracy. Using the machine learning models, we also devised a measure of anonymity and compared it for unlabelled addresses. Further, to qualitatively validate our predictions, we discovered Ethereum addresses used on dark web pages and predicted their functional roles with our trained models. Most of these addresses behaved like Wallet_app, Shapeshift, and Mining, and this prediction aligned with the background information extracted from the context in which each address was used on the dark web page.
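To make the idea of features derived from an address's transactional history over a fixed window more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the transaction tuple layout, the function name `window_features`, and the five features shown are hypothetical, chosen only for illustration. The abstract does not list the eleven features actually used.

```python
from statistics import mean

# Hypothetical transaction record: (sender, receiver, value_in_ether, timestamp).
# The field layout and the features below are illustrative assumptions,
# not the eleven features used in the paper.


def window_features(address, transactions, window_size=200):
    """Summarise the most recent `window_size` transactions touching `address`."""
    involved = [t for t in transactions if address in (t[0], t[1])]
    involved.sort(key=lambda t: t[3])   # oldest first
    window = involved[-window_size:]    # keep only the latest window

    sent = [t for t in window if t[0] == address]
    received = [t for t in window if t[1] == address]

    # Distinct addresses this address interacted with inside the window.
    counterparties = {t[1] for t in sent} | {t[0] for t in received}
    counterparties.discard(address)

    return {
        "n_sent": len(sent),
        "n_received": len(received),
        "mean_sent_value": mean(t[2] for t in sent) if sent else 0.0,
        "mean_received_value": mean(t[2] for t in received) if received else 0.0,
        "n_counterparties": len(counterparties),
    }


# Toy data with made-up addresses and values.
txs = [
    ("0xA", "0xB", 1.0, 10),
    ("0xC", "0xA", 2.0, 20),
    ("0xA", "0xD", 3.0, 30),
    ("0xB", "0xC", 5.0, 40),
]
features = window_features("0xA", txs, window_size=2)
print(features)
```

Feature dictionaries of this kind, computed for each labeled address, could then be turned into fixed-length vectors and passed to a classifier such as XGBoost, which is the type of model the abstract reports on.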

Availability of data and materials

Data are available on request.

Code availability

The code repository containing the implementations and results is available on request.

Acknowledgements

This research project was partially funded by the Blockchain Research Lab at Information Technology University (ITU), Lahore, Pakistan.

Funding

This research project was partially funded by the National Center of Cyber Security (NCCS) Pakistan.

Author information

Contributions

Tania Saleem: Conceptualization, Data Curation, Software, Validation, Visualization, Writing - original draft, Writing - review & editing, Methodology, Investigation, Formal Analysis, Project Administration. Muhammad Ismaeel: Conceptualization, Methodology, Software, Data Curation, Validation, Visualization, Writing - review & editing. Muhammad Umar Janjua: Conceptualization, Resources, Funding acquisition, Methodology, Writing - review & editing, Supervision, Project Administration. Abdul Rehman Ali: Methodology, Software, Visualization, Writing - review & editing. Awab Aqib: Writing - original draft, Writing - review & editing, Methodology. Ali Ahmed: Methodology, Supervision, Writing - review & editing. Saeed-ul Hassan: Supervision, Writing - review & editing.

Corresponding author

Correspondence to Tania Saleem.

Ethics declarations

Ethics approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Consent to participate

On the basis of their contribution, all authors agree to participate.

Consent for publication

All authors consent to this publication.

Conflict of interest/Competing interests

The authors have no conflicts of interest to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection: 3 - Track on Blockchain

Guest Editor: Haojin Zhu

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Saleem, T., Ismaeel, M., Janjua, M.U. et al. Predicting functional roles of Ethereum blockchain addresses. Peer-to-Peer Netw. Appl. 16, 2985–3002 (2023). https://doi.org/10.1007/s12083-023-01553-2

  • DOI: https://doi.org/10.1007/s12083-023-01553-2
