Generation and Classification of Illicit Bitcoin Transactions

de Juan Fidalgo, Pablo; Cámara, Carmen; Peris-Lopez, Pedro

doi:10.1007/978-3-031-21333-5_108

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 594)

Included in the following conference series:

International Conference on Ubiquitous Computing and Ambient Intelligence

Abstract

Financial fraud is an everyday problem for banking institutions. Since Bitcoin emerged as a new model built on decentralisation and anonymity, attackers have taken advantage of this monetary system: it allows them to collect funds from illegal activities, such as ransomware payments, and to hide them. At the same time, Law Enforcement Agencies use open-source data to apply network forensics to blockchain data, and this analysis is usually performed with artificial intelligence. Unfortunately, high-quality data sets for training the detection algorithms are scarce. This work tries to overcome this barrier with significant contributions. First, we have extended the Elliptic Data Set, the most extensive labelled transaction data set publicly available for any cryptocurrency, with nearly 25,000 illicit transactions. The original data set contained only 4,545 illicit transactions, an illicit/licit class ratio of 9.8:90.2; our work changes that ratio to 41.2:58.8. Second, to show that class imbalance can also be mitigated with synthetic data, we have studied the use of generative adversarial networks (GANs) to create synthetic samples. Finally, we have applied deep learning, in particular long short-term memory (LSTM) networks, to the binary classification problem. We show ideal results that can help change the current state-of-the-art trend, which is mainly focused on machine learning algorithms.
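
The class ratios quoted in the abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the authors' code. The abstract gives the illicit count (4,545) but not the licit count; the value 42,019 used here is an assumption based on the public description of the original Elliptic Data Set. The function names are hypothetical.

```python
from typing import Tuple


def class_ratio(illicit: int, licit: int) -> Tuple[float, float]:
    """Return (illicit %, licit %) of the labelled transactions, to one decimal."""
    total = illicit + licit
    if total <= 0:
        raise ValueError("at least one labelled transaction is required")
    illicit_pct = round(100 * illicit / total, 1)
    return illicit_pct, round(100 - illicit_pct, 1)


def illicit_needed(licit: int, target_illicit_pct: float) -> int:
    """Illicit count that gives the target illicit share for a fixed licit count."""
    share = target_illicit_pct / 100
    return round(licit * share / (1 - share))


ORIGINAL_ILLICIT = 4_545  # stated in the abstract
ASSUMED_LICIT = 42_019    # ASSUMPTION: not stated in the abstract

# Ratio of the original data set under the assumed licit count.
print(class_ratio(ORIGINAL_ILLICIT, ASSUMED_LICIT))

# Illicit total implied by the reported 41.2% share, and the implied increase.
implied_total = illicit_needed(ASSUMED_LICIT, 41.2)
print(implied_total, implied_total - ORIGINAL_ILLICIT)
```

Under that assumed licit count, the first call reproduces the 9.8:90.2 ratio, and the implied increase is a little under 25,000 illicit transactions, which is consistent with the figure quoted in the abstract.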

Notes

1. https://github.com/PabDJ/IllicitBitcoinTransactions
2. https://www.kaggle.com/datasets/pablodejuanfidalgo/augmented-elliptic-data-set

Acknowledgment

This work was supported by the Spanish Ministry of Science, Innovation and Universities grant PID2019-111429RBC21 (ODIO), and by the Comunidad de Madrid (Spain) under the projects PUCFA (PUCFA-CM-UC3M) and CYNAMON (P2018/TCS-4566), co-financed by European Structural Funds (ESF and FEDER).

We want to thank Claudio Bellei, Head of Data Science at Elliptic, for providing us with the features related to the illicit transactions that we collected.

Author information

Authors and Affiliations

Universidad Carlos III de Madrid, Madrid, Spain
Pablo de Juan Fidalgo, Carmen Cámara & Pedro Peris-Lopez

Corresponding author

Correspondence to Pablo de Juan Fidalgo.

Editor information

Editors and Affiliations

Castilla-La Mancha University, Ciudad Real, Spain
José Bravo
Computer Science Department, University of Chile, Santiago, Chile
Sergio Ochoa
CICESE, Ensenada, Mexico
Jesús Favela

About this paper

Cite this paper

de Juan Fidalgo, P., Cámara, C., Peris-Lopez, P. (2023). Generation and Classification of Illicit Bitcoin Transactions. In: Bravo, J., Ochoa, S., Favela, J. (eds) Proceedings of the International Conference on Ubiquitous Computing & Ambient Intelligence (UCAmI 2022). UCAmI 2022. Lecture Notes in Networks and Systems, vol 594. Springer, Cham. https://doi.org/10.1007/978-3-031-21333-5_108

DOI: https://doi.org/10.1007/978-3-031-21333-5_108
Published: 21 November 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21332-8
Online ISBN: 978-3-031-21333-5
eBook Packages: Intelligent Technologies and Robotics (R0)
