Abstract
Financial fraud is an everyday problem for banking institutions. With the emergence of Bitcoin as a new monetary model that relies on decentralisation and anonymity, attackers have taken advantage of this system: it allows them to obtain funds from illegal activities, such as ransomware payments, and to hide them. At the same time, Law Enforcement Agencies use open-source data to apply network forensics to blockchain data. The analysis is usually performed with artificial intelligence. Unfortunately, high-quality data sets for training the detection algorithms are scarce. This work tries to overcome this barrier with several contributions. First, we have extended the Elliptic Data Set, the most extensive labelled transaction data set publicly available in any cryptocurrency, with nearly 25,000 illicit transactions. The original data set contained only 4,545 illicit transactions, giving an illicit/licit ratio of 9.8:90.2; our work changes that ratio to 41.2:58.8. Second, to show that class imbalance can also be mitigated with synthetic data, we have studied the use of generative adversarial networks (GANs) for creating synthetic samples. Finally, we apply deep learning, and in particular long short-term memory (LSTM) networks, to the binary classification problem. We report strong results that can help change the current state-of-the-art trend, which is mainly focused on machine learning algorithms.
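As a rough consistency check on the figures quoted above, the following sketch recomputes the implied class counts using only the numbers stated in the abstract (4,545 illicit transactions and the two illicit/licit ratios). The helper name `implied_count` is hypothetical, and because the published ratios are rounded to one decimal place, the derived counts are approximations and not the exact sizes of the data sets.

```python
def implied_count(known_count, known_share, other_share):
    """Return the count of the other class implied by a two-class ratio.

    known_share and other_share are the two sides of the ratio
    (for example 9.8 and 90.2); only their quotient matters.
    """
    return known_count * other_share / known_share


# Figures taken from the abstract.
original_illicit = 4545
original_ratio = (9.8, 90.2)    # illicit : licit before augmentation
augmented_ratio = (41.2, 58.8)  # illicit : licit after augmentation

# Licit count implied by the original ratio (approximate, ratios are rounded).
licit = implied_count(original_illicit, original_ratio[0], original_ratio[1])

# Assumption: the licit class is unchanged by the augmentation, so the new
# ratio determines the total number of illicit transactions.
augmented_illicit = implied_count(licit, augmented_ratio[1], augmented_ratio[0])
added_illicit = augmented_illicit - original_illicit

print(f"implied licit transactions:     {licit:,.0f}")
print(f"implied illicit after increase: {augmented_illicit:,.0f}")
print(f"implied illicit added:          {added_illicit:,.0f}")
```

Under the assumption that only illicit transactions were added, the implied increase falls just under 25,000, which agrees with the "nearly 25,000" stated in the abstract.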
Notes
1. https://github.com/PabDJ/IllicitBitcoinTransactions
2. https://www.kaggle.com/datasets/pablodejuanfidalgo/augmented-elliptic-data-set
Acknowledgment
This work was supported by the Spanish Ministry of Science, Innovation and Universities grant PID2019-111429RBC21(ODIO), and by the Comunidad de Madrid (Spain) under the projects PUCFA (PUCFA-CM-UC3M) and CYNAMON (P2018/TCS-4566), co-financed by European Structural Funds (ESF and FEDER).
We want to thank Claudio Bellei, Head of Data Science at Elliptic, for providing us with the features related to the illicit transactions that we collected.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
de Juan Fidalgo, P., Cámara, C., Peris-Lopez, P. (2023). Generation and Classification of Illicit Bitcoin Transactions. In: Bravo, J., Ochoa, S., Favela, J. (eds) Proceedings of the International Conference on Ubiquitous Computing & Ambient Intelligence (UCAmI 2022). UCAmI 2022. Lecture Notes in Networks and Systems, vol 594. Springer, Cham. https://doi.org/10.1007/978-3-031-21333-5_108
DOI: https://doi.org/10.1007/978-3-031-21333-5_108
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21332-8
Online ISBN: 978-3-031-21333-5
eBook Packages: Intelligent Technologies and Robotics (R0)