Detecting artificial behaviours in the Bitcoin users graph

https://doi.org/10.1016/j.osnem.2017.10.006

Abstract

A unique feature of cryptocurrencies such as Bitcoin is that the blockchain containing all the economic transactions is publicly available. This makes it possible to gain insights into the behaviour of the users through an analysis of the topological properties of the users graph, which is derived from the Bitcoin transaction graph through clustering heuristics. In a previous work, we analysed the users graph and discovered that the graph is not a small world, due to the presence of outliers in the in-degree frequency distribution of the nodes and to a high diameter, in spite of a small average distance between the nodes of the graph. In this paper, we explain our findings, showing that these structural properties of the network are due to peculiar patterns in the users graph. As a further remark, we argue that these patterns are probably due to artificial user behaviours and are not strictly related to normal economic interactions.

Introduction

The growing adoption, over the last few years, of Bitcoin [1], the first true digital currency, together with the public availability of its blockchain, makes it both interesting and feasible to analyse the behaviour of the users of this peculiar economy. Even if Bitcoin still represents a niche economy, it is no longer an experimental currency used only by computer science specialists, and it has reached widespread usage. Therefore, the analysis of its blockchain may return interesting insights into the behaviour of the users of a cryptocurrency.

The novelty of the users graph derived from the Bitcoin blockchain is that nodes represent pseudonymous users and a link between nodes represents an economic interaction, i.e. a flow of value between nodes. Interesting analyses of the Bitcoin ecosystem start from an analysis of this graph. Further interesting findings may be obtained by integrating this analysis with that of other social networks [2]. For instance, analyzing the sentiment of opinions and information distributed on Twitter regarding Bitcoin and comparing it with Bitcoin’s price can be used to check whether there is a correlation between Twitter sentiment and BTC price fluctuation. Other works have pointed out the strong correlation between Bitcoin’s price and searches for the term “bitcoin”, as measured by Google Trends [3].

Our previous work [4] analysed several properties of the Bitcoin users graph. In particular, we showed that the graph presents many features characteristic of the small-world phenomenon, but also some odd behaviours. As a matter of fact, while the average distance between nodes is low, the graph has a high diameter. This highlights the presence of a few pairs of nodes connected only by long paths. Furthermore, the in-degree frequency distribution of the nodes presents some relevant outliers. As power law degree sequences and a small diameter are well-established requirements [5] for assessing the small-world and scale-free phenomena in complex networks in general, we argue that it is interesting to study these anomalies of our network and the reasons behind them. A possible explanation is that the odd behaviours are caused by users exploiting Bitcoin not only for ordinary transactions but also for other activities, like fund management and, possibly, attacks, as stated in the following Conjecture.

Conjecture 1

(a) The indegree frequency distribution anomalies and (b) the high diameter are caused by uncommon artificial user behaviour rather than being inherent properties of the system.
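The contrast behind part (b), a low average distance coexisting with a high diameter, can be illustrated with a minimal Python sketch. The toy graph below is purely hypothetical and undirected (a dense core with one long chain attached); it is not the Bitcoin users graph and the function names are assumptions made for illustration only.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance_and_diameter(adj):
    """Average and maximum shortest-path length over all connected ordered pairs."""
    total, pairs, diameter = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return total / pairs, diameter


def toy_graph(core_size=30, chain_length=12):
    """Hypothetical toy graph: a clique (the core) plus one long chain hanging off it."""
    adj = {}

    def add_edge(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for i in range(core_size):
        for j in range(i + 1, core_size):
            add_edge(("core", i), ("core", j))
    prev = ("core", 0)
    for i in range(chain_length):
        add_edge(prev, ("chain", i))
        prev = ("chain", i)
    return adj


avg, diam = average_distance_and_diameter(toy_graph())
print(f"average distance = {avg:.2f}, diameter = {diam}")
```

In this toy case a single chain of 12 nodes pushes the diameter to 13 while the average distance stays below 4, because most pairs of nodes lie in the dense core. Exhaustive breadth-first search is only practical for small graphs; it is used here for clarity, not as the method applied to the real dataset.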

Our paper supports this conjecture, and our analysis shows that these behaviours are a consequence of anomalous chains of particular transactions, which we call suspicious transactions. An (α, k)-suspicious transaction is one in which the only input address pays α BTC to each of the output addresses, which are at least k, except at most one, which can be the recipient of an arbitrary amount. We show that special combinations of suspicious transactions, called pseudo-spam transactions, are the sole cause of the indegree frequency distribution anomalies and can each be agglomerated into just one cluster, which yields a much shorter diameter. We remark that these so-called suspicious transactions are not “suspicious” by themselves, i.e. they might very well be the result of normal user activity. They were labeled as such both because they appeared during our manual observations presented in Section 3 and because they become interesting when combined together in long chains.
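The informal description of an (α, k)-suspicious transaction given above can be turned into a small predicate. The following Python sketch is an illustration under stated assumptions, not the authors' implementation: the `Transaction` type and the function name are hypothetical, amounts are expressed in satoshi (1 BTC = 100,000,000 satoshi) so that equality with α is exact, and "at least k" is read as a bound on the total number of outputs, including the possible exception.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Transaction:
    """Hypothetical, simplified transaction record used only for this sketch."""
    inputs: List[str]               # input addresses
    outputs: List[Tuple[str, int]]  # (output address, amount in satoshi)


def is_suspicious(tx: Transaction, alpha: int, k: int) -> bool:
    """Check the informal (alpha, k)-suspicious condition.

    Assumption: alpha is given in satoshi and k bounds the total number
    of outputs (the arbitrary-amount output, if any, is counted too).
    """
    # Exactly one distinct input address funds the transaction.
    if len(set(tx.inputs)) != 1:
        return False
    # There must be at least k outputs.
    if len(tx.outputs) < k:
        return False
    # Every output receives alpha, except at most one arbitrary amount.
    exceptions = sum(1 for _, amount in tx.outputs if amount != alpha)
    return exceptions <= 1


# Usage: one payer, three outputs of 1 satoshi and one output with the remainder.
tx = Transaction(
    inputs=["addr_in"],
    outputs=[("a1", 1), ("a2", 1), ("a3", 1), ("rest", 49_990)],
)
print(is_suspicious(tx, alpha=1, k=3))
```

The single tolerated exception typically corresponds to an output carrying the remaining funds, which is what allows such transactions to be chained one after another.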

This paper extends our previous work [6] in several directions:

  • we investigate a further unusual characteristic of the Bitcoin users graph, i.e. the high value of the graph diameter. We investigate the reason for this anomalous value by showing that it is due to unusual transaction patterns. These transactions share some of the features of the patterns previously detected during the indegree outliers analysis, so we derive from the general definition of suspicious transactions a definition for the transaction patterns that cause the high diameter;

  • we give the definition of (α, k)-suspicious transaction, which makes it possible to unify the description of the different interesting transaction patterns. The entire paper has been restructured accordingly;

  • we extend both the related work and the discussion on the economical meaning of the interesting patterns.

The paper is organized as follows. In Section 2 we review related work. In Section 3 we report our observations. In Section 4 we present interesting transaction schemes, introducing the concept of suspicious transactions, and we discuss their economical meaning in Section 5. Section 6 shows how these transaction schemes can be used to explain our empirical observations. Finally, Section 7 reports our conclusions and future work.


Related work

Several features of the Bitcoin network have been recently analysed. Most analyses are based on a “transaction graph” built from the blockchain. This graph is the directed hypergraph connecting the set of input addresses of a transaction to all its output addresses.

This can be transformed into the “users graph” through a well-established heuristic rule. By applying this rule, all the input addresses of a multi-input transaction are considered as belonging to the same user [1], [7] (and we say
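The multi-input heuristic described above is commonly implemented with a union-find (disjoint-set) structure. The Python sketch below is a minimal illustration under that assumption; the function name and the `(inputs, outputs)` representation of a transaction are hypothetical and are not taken from the paper.

```python
def cluster_addresses(transactions):
    """Group addresses into users with the multi-input heuristic.

    transactions: iterable of (inputs, outputs) pairs, each a list of
    address strings (a simplified, hypothetical representation).
    Returns a list of sets, one set of addresses per inferred user.
    """
    parent = {}

    def find(a):
        # Register unseen addresses as their own cluster, then walk to the root
        # while halving the path to keep later lookups fast.
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for inputs, outputs in transactions:
        # Make sure every address, including pure recipients, is known.
        for addr in list(inputs) + list(outputs):
            find(addr)
        # All input addresses of one transaction belong to the same user.
        for other in inputs[1:]:
            union(inputs[0], other)

    clusters = {}
    for addr in list(parent):
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())


# Usage: "a" and "b" are co-spent, then "b" and "c" are co-spent,
# so the three addresses end up in the same cluster.
txs = [(["a", "b"], ["x"]), (["b", "c"], ["y"]), (["d"], ["a"])]
users = cluster_addresses(txs)
print(sorted(sorted(u) for u in users))
```

Each resulting cluster becomes one node of the users graph; an edge between two clusters can then be added whenever a transaction moves value from an address in the first to an address in the second.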

Our observations

In this section, we present our observations concerning the two main anomalies we want to study: outliers in the indegree frequency distribution and the surprisingly high diameter. Based on the insight obtained from such observations we will give a formalization of interesting transaction chains (and the transactions they are made of) in Section 4. We will then show in Section 6 how these chains are responsible for the anomalies we observed.
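As a point of reference for the first anomaly, the indegree frequency distribution simply counts how many nodes have each indegree value. A minimal Python sketch follows; it assumes the users graph is available as a list of directed `(source, target)` edges, treats repeated edges as one, and uses a hypothetical function name.

```python
from collections import Counter


def indegree_frequency(edges):
    """Map each indegree value to the number of nodes having that indegree.

    edges: iterable of (source, target) pairs; duplicates are ignored
    (an assumption of this sketch).
    """
    nodes = set()
    indegree = Counter()
    for src, dst in set(edges):
        nodes.update((src, dst))
        indegree[dst] += 1
    # Nodes that never appear as a target have indegree 0.
    return Counter(indegree[n] for n in nodes)


# Usage on a tiny hypothetical edge list.
freq = indegree_frequency([("a", "c"), ("b", "c"), ("c", "d"), ("a", "c")])
for degree in sorted(freq):
    print(degree, freq[degree])
```

Plotting these frequencies on a log-log scale is the usual way to compare them against a power law; an outlier is an indegree value whose frequency departs markedly from that trend.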

Interesting transaction schemes

In this section, we will model interesting transactions and related structures. We will show in Section 6 how these transaction structures are responsible for the anomalies observed in the previous section.

Let us now introduce the concept of suspicious transaction.

Definition 2

Let A be the set of all addresses present in the blockchain. Given α ∈ R, k ∈ N, and a transaction t modeled as a tuple (In, Out, InAmount, Fees), where:

  • In ⊆ A;

  • Out is a multiset of couples (o, b) where o ∈ A and b ∈ R, where b

On the economical meaning of our transaction schemes

First, we want to clarify the reason behind the naming adopted for the transaction schemes introduced in Section 4 (derived from the observations in Section 3). We decided to use the term “pseudo-spam” to label those transaction models that seemed to have a “spam” effect on the graph measures, i.e. transactions that, even though they are a minority in the dataset, had a macroscopically visible effect on the graph properties, because of how they were formed or because of how they were combined together.

We

Relating transaction schemes to anomalies: experimental evaluation

In this section, we will experimentally show how the transaction structures defined in Section 4 can be used to explain the anomalies observed in Section 3.

Conclusions

This paper investigates the presence of outliers in the indegree frequency distribution and the high diameter we have observed in the Bitcoin users graph. By manually analysing the users graph we have found out that these phenomena are generated by peculiar chains of transactions. We have then formally characterized such chains and automatically studied their impact on the dataset. We have also given possible interpretations of the purpose of the observed chains in the Bitcoin ecosystem. We

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (23)

  • S. Nakamoto, Bitcoin: a peer-to-peer electronic cash system,...
  • D. Garcia et al., The digital traces of bubbles: feedback cycles between socio-economic signals in the Bitcoin economy, J. R. Soc. Interface (2014)
  • M. Matta et al., The predictor impact of web search media on Bitcoin trading volumes, Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K) (2015)
  • D.D.F. Maesa et al., Data-driven analysis of Bitcoin properties: exploiting the users graph, Int. J. Data Sci. Anal. (2017)
  • S. Dommers et al., Diameters in preferential attachment models, J. Stat. Phys. (2010)
  • D. Di Francesco Maesa et al., An analysis of the Bitcoin users graph: inferring unusual behaviours, Proceedings of the International Workshop on Complex Networks and their Applications (2016)
  • R. Fergal et al., An analysis of anonymity in the bitcoin system, Proceedings of the 2011 PASSAT/SocialCom 2011 (2011)
  • M. Lischke et al., Analyzing the Bitcoin network: the first four years, Future Internet (2016)
  • S. Meiklejohn et al., A fistful of Bitcoins: characterizing payments among men with no names, Proceedings of the 2013 Internet Measurement Conference, IMC (2013)
  • D. Kondor et al., Do the rich get richer? An empirical analysis of the Bitcoin transaction network, PLoS One (2014)
  • E. Androulaki et al., Evaluating user privacy in Bitcoin, Proceedings of the International Conference on Financial Cryptography and Data Security (2013)

Damiano Di Francesco Maesa received his bachelor's degree (cum laude) in theoretical computer science from the University of Siena and his master's degree (cum laude) in computer science from the University of Pisa. He is currently a Ph.D. student at the University of Pisa, where he is working on novel applications of Bitcoin and blockchain technology. During the three years of his Ph.D. program so far he has published three papers in conference proceedings and one in a refereed international journal, and has been a reviewer for several papers. He has also given guest lectures and seminars on blockchain technology, has co-supervised two bachelor's theses, and has been an academic guest at the “Research Group for Distributed Computing (DISCO)”, part of the “Computer Engineering and Networks Laboratory” at ETH Zürich.

Dr. Andrea Marino, Assistant Professor at Dipartimento di Informatica, University of Pisa. Ph.D. degree in Computer Science at University of Florence. Research Fellow at University of Milan and Pisa. Author of a book, 25 conference papers, and 10 journal papers on graph algorithms with applications to enumeration, web crawling, bioinformatics, real-world graph analysis, information retrieval, mobile ad hoc networks, and computational linguistics. Co-author of the current best algorithms to list all the paths, cycles, cliques and other popular patterns in graphs. One of the developers of BUbiNG, the open-source web crawler with the highest performance (downloading, storing, and managing billions of web pages), developed at LAW (Laboratory of Web Algorithmics), University of Milan. Co-author of the current best algorithms to compute exactly the diameter, hyperbolicity, and top-central nodes (closeness centrality) in huge graphs. The diameter algorithm has been used to compute the diameter of the Facebook network (1.2 billion nodes). Collaborator of the INRIA (Institut National de Recherche en Informatique et en Automatique) BAMBOO & BAOBAB Team, Université Claude Bernard (Lyon1, France), designing ad hoc listing algorithms for metabolic networks and NGS (Next Generation Sequencing).

Laura Ricci received her master's degree in Computer Science from the University of Pisa in 1983 and her Ph.D. from the University of Pisa in 1990. Currently, she is an Assistant Professor at the Department of Computer Science, University of Pisa, Italy. Her research interests include parallel and distributed systems, peer-to-peer networks, cryptocurrencies and blockchains. In these fields, she has co-authored over 100 papers in refereed scientific journals and conference proceedings. She has served as a program committee member of several conferences and has been a reviewer for several journals.
