Code cloning in smart contracts: a case study on verified contracts from the Ethereum blockchain platform

Empirical Software Engineering

Abstract

Ethereum is a blockchain platform that hosts and executes smart contracts. Smart contracts have been used to implement cryptocurrencies and crowdfunding initiatives (ICOs). A major concern in Ethereum is the security of smart contracts. Unlike traditional software, smart contracts are immutable once deployed. Hence, vulnerabilities and bugs in smart contracts can lead to catastrophic financial losses. To avoid the risk of writing buggy code, smart contract developers are encouraged to reuse pieces of code from reputable sources (e.g., OpenZeppelin). In this paper, we study code cloning in Ethereum. Our goal is to quantify the number of clones in Ethereum (RQ1), understand key characteristics of clone clusters (RQ2), and determine whether smart contracts contain pieces of code that are identical to those published by OpenZeppelin (RQ3). We applied Deckard, a tree-based clone detector, to all Ethereum contracts for which the source code was available. We observe that developers frequently clone contracts. In particular, 79.2% of the studied contracts are clones and we note an upward trend in the number of cloned contracts per quarter. With regard to the characteristics of clone clusters, we observe that: (i) 9 out of the top-10 largest clone clusters are token managers, (ii) most of the activity of a cluster tends to be concentrated on a few contracts, and (iii) contracts in a cluster tend to be created by several authors. Finally, we note that the studied contracts have different ratios of code blocks that are identical to those provided by the OpenZeppelin project. Due to the immutability of smart contracts, as well as the impossibility of reverting transactions once they are deemed final, we conclude that the aforementioned findings have implications for the security, development, and usage of smart contracts.

Notes

  1. https://libra.org

  2. https://github.com/SAILResearch/suppmaterial-18-masanari-smart_contract_cloning

  3. https://etherscan.io

  4. https://github.com/OpenZeppelin/openzeppelin-solidity

  5. https://etherscan.io/contractsVerified

  6. https://github.com/federicobond/solidity-parser-antlr

  7. https://github.com/skyhover/Deckard

  8. https://github.com/solidityj/solidity-antlr4

  9. https://www.cryptokitties.co

  10. https://github.com/ConsenSys/Tokens

  11. https://github.com/ConsenSys/Tokens/blob/master/contracts/eip20/EIP20.sol

  12. https://github.com/OpenZeppelin/openzeppelin-contracts/issues/1716

  13. https://github.com/OpenZeppelin/openzeppelin-contracts/issues/2006

  14. https://semver.org

  15. https://docs.openzeppelin.com/contracts/2.x/api-stability

  16. https://maven.apache.org

  17. https://www.npmjs.com

  18. http://ethpm.com

  19. https://eos.io

  20. https://poa.network/

  21. https://github.com/melonproject/oyente

  22. A pyramid scheme contract: https://etherscan.io/address/0x09f55c2d116a5833d41ba9208216d11a7cdba4b3#code

  23. https://www.youtube.com/channel/UCpEUyenjL908MFMCO-J_yhw

  24. https://securify.ch

  25. Market capitalization is the product of a company’s number of outstanding shares and its current share price. In the cryptocurrency world, the analogous figure is the total coin supply multiplied by the coin’s market price. As of January 7, 2020, Ethereum had a total supply of 109,174,249 ether at a market price of 143.55 USD per ether, yielding a market capitalization of 15.67 billion USD.

  26. https://en.wikipedia.org/wiki/Initial_public_offering

  27. https://www.mycryptoheroes.net

  28. https://idex.market

  29. Example of a transaction that created a smart contract: https://etherscan.io/tx/0xebcbe706f9959c8b98a72bcd42fed545d3cf60fe3fa801186d5fef2249dac91a

  30. There are tools to help developers flatten Solidity code. An example is truffle-flattener, available at https://www.npmjs.com/package/truffle-flattener.

  31. https://idex.market

References

  • Baker BS (1992) A program for identifying duplicated code. In: Computer Science and Statistics: Proceedings of the 24th Symposium on the Interface, vol 24, pp 49–57

  • Bartoletti M, Carta S, Cimoli T, Saia R (2017) Dissecting ponzi schemes on Ethereum: identification, analysis, and impact. arXiv:1703.03779

  • Bellon S, Koschke R, Antoniol G, Krinke J, Merlo E (2007) Comparison and evaluation of clone detection tools. IEEE Trans. Softw. Eng. 33(9):577–591. https://doi.org/10.1109/TSE.2007.70725

  • Bettenburg N, Shang W, Ibrahim WM, Adams B, Zou Y, Hassan AE (2012) An empirical study on inconsistent changes to code clones at the release level. Sci. Comput. Program. 77(6):760–776. https://doi.org/10.1016/j.scico.2010.11.010

  • Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics 5:135–146

  • Ceriani L, Verme P (2012) The origins of the gini index: extracts from variabilità e mutabilità (1912) by corrado gini. J Econ Inequal 10(3):421–443. https://doi.org/10.1007/s10888-011-9188-x

  • Chen W, Zheng Z, Cui J, Ngai E, Zheng P, Zhou Y (2018) Detecting ponzi schemes on Ethereum: Towards healthier blockchain technology. In: Proceedings of the 2018 World Wide Web Conference. WWW ’18. International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, pp 1409–1418. https://doi.org/10.1145/3178876.3186046

  • Cordy JR, Roy CK (2011) The NiCad clone detector. In: Proceedings of the 2011 IEEE 19th International Conference on Program Comprehension. ICPC ’11. IEEE Computer Society, USA, pp 219–220. https://doi.org/10.1109/ICPC.2011.26

  • di Angelo M, Salzer G (2019) A survey of tools for analyzing Ethereum smart contracts. In: 2019 IEEE International Conference on Decentralized Applications and Infrastructures (DAPPCON), pp 69–78

  • Dijkstra EW (1982) On the role of scientific thought. Springer, New York, NY, pp 60–66. https://doi.org/10.1007/978-1-4612-5695-3_12

  • Economist T (2018) Blockchain technology may offer a way to re-decentralise the internet, The Economist Group Limited. [Online; accessed 10-August-2018]

  • Fröwis M, Böhme R (2017) In code we trust? In: Garcia-Alfaro J, Navarro-Arribas G, Hartenstein H, Herrera-Joancomartí J (eds) Data Privacy Management, Cryptocurrencies and Blockchain Technology. Springer International Publishing, Cham, pp 357–372

  • Gamma E, Helm R, Johnson R, Vlissides J (1995) Design patterns: Elements of reusable object-oriented software. Addison-Wesley Reading, Boston, MA, USA

  • Gao Z, Jayasundara V, Jiang L, Xia X, Lo D, Grundy J (2019) Smartembed: A tool for clone and bug detection in smart contracts through structural code embedding. In: Proceedings of the 35th International Conference on Software Maintenance and Evolution. ICSME ’19

  • Göde N, Koschke R (2009) Incremental clone detection. In: Proceedings of the 2009 European Conference on Software Maintenance and Reengineering. CSMR ’09. IEEE Computer Society, USA, pp 219–228. https://doi.org/10.1109/CSMR.2009.20

  • Grishchenko I, Maffei M, Schneidewind C (2018) Foundations and tools for the static analysis of Ethereum smart contracts. In: Chockler H, Weissenbacher G (eds) Computer Aided Verification. Springer International Publishing, Cham, pp 51–78

  • Hassan AE (2009) Predicting faults using the complexity of code changes. In: Proceedings of the 31st International Conference on Software Engineering. ICSE ’09. IEEE Computer Society, Washington, DC, USA, pp 78–88. https://doi.org/10.1109/ICSE.2009.5070510

  • Hindle A, Barr ET, Su Z, Gabel M, Devanbu P (2012) On the naturalness of software. In: Proceedings of the 34th International Conference on Software Engineering. ICSE ’12. IEEE Press, Piscataway, NJ, USA, pp 837–847. http://dl.acm.org/citation.cfm?id=2337223.2337322

  • Horwitz J, Huang Z (2018) “CryptoKitties” clones are already popping up in China. [Online; accessed 02-December-2019]

  • Jakobsson M, Juels A (1999) Proofs of work and bread pudding protocols. In: Proceedings of the IFIP TC6/TC11 Joint Working Conference on Secure Information Networks: Communications and Multimedia Security. CMS ’99. Kluwer, B.V., Deventer, The Netherlands, pp 258–272. http://dl.acm.org/citation.cfm?id=647800.757199

  • Jiang L, Misherghi G, Su Z, Glondu S (2007a) Deckard: Scalable and accurate tree-based detection of code clones. In: Proceedings of the 29th International Conference on Software Engineering. ICSE ’07. IEEE Computer Society, Washington, DC, USA, pp 96–105. https://doi.org/10.1109/ICSE.2007.30

  • Jiang L, Su Z, Chiu E (2007b) Context-based detection of clone-related bugs. In: Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on The Foundations of Software Engineering. ESEC-FSE ’07. ACM, New York, NY, USA, pp 55–64. https://doi.org/10.1145/1287624.1287634

  • Juergens E, Deissenboeck F, Hummel B, Wagner S (2009) Do code clones matter? In: Proceedings of the 31st International Conference on Software Engineering. ICSE ’09. IEEE Computer Society, Washington, DC, USA, pp 485–495. https://doi.org/10.1109/ICSE.2009.5070547

  • Kalra S, Goel S, Dhawan M, Sharma S (2018) ZEUS: analyzing safety of smart contracts. In: 25th Annual Network and Distributed System Security Symposium, NDSS 2018, San Diego, California, USA, February 18-21, 2018. NDSS ’18. The Internet Society

  • Kaminska I (2017) It’s not just a Ponzi, it’s a ‘smart’ Ponzi. [Online; accessed 26-August-2018]

  • Kamiya T, Kusumoto S, Inoue K (July 2002) Ccfinder: A multilinguistic token-based code clone detection system for large scale source code. IEEE Trans. Softw. Eng. 28(7):654–670. https://doi.org/10.1109/TSE.2002.1019480

  • Kapser CJ, Godfrey MW (2008) “cloning considered harmful” considered harmful: patterns of cloning in software. Empir Softw Eng 13(6):645. https://doi.org/10.1007/s10664-008-9076-6

  • Kim M, Sazawal V, Notkin D, Murphy G (2005) An empirical study of code clone genealogies. In: Proceedings of the 10th European Software Engineering Conference Held Jointly with 13th ACM SIGSOFT International Symposium on Foundations of Software Engineering. ESEC/FSE-13. ACM, New York, NY, USA, pp 187–196. https://doi.org/10.1145/1081706.1081737

  • Koschke R (2008) Identifying and removing software clones. In: Mens T, Demeyer S (eds) Software Evolution, 1st edn. Springer

  • Liu H, Yang Z, Jiang Y, Zhao W, Sun J (2019) Enabling clone detection for Ethereum via smart contract birthmarks. In: Proceedings of the 27th International Conference on Program Comprehension. ICPC ’19. IEEE Press, Piscataway, NJ, USA, pp 105–115. https://doi.org/10.1109/ICPC.2019.00024

  • Liu H, Yang Z, Liu C, Jiang Y, Zhao W, Sun J (2018) Eclone: Detect semantic clones in Ethereum via symbolic transaction sketch. In: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ESEC/FSE 2018. ACM, New York, NY, USA, pp 900–903. https://doi.org/10.1145/3236024.3264596

  • Lopes CV, Maj P, Martins P, Saini V, Yang D, Zitny J, Sajnani H, Vitek J (October 2017) Déjàvu: A map of code duplicates on github. Proc. ACM Program. Lang. 1(OOPSLA):84:1–84:28. https://doi.org/10.1145/3133908

  • Luu L, Chu D-H, Olickel H, Saxena P, Hobor A (2016) Making smart contracts smarter. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. CCS ’16. ACM, New York, NY, USA, pp 254–269. https://doi.org/10.1145/2976749.2978309

  • Mockus A (2007) Large-scale code reuse in open source software. In: Proceedings of the First International Workshop on Emerging Trends in FLOSS Research and Development. FLOSS ’07. IEEE Computer Society, Washington, DC, USA, pp 7–. https://doi.org/10.1109/FLOSS.2007.10

  • Popper N (2017) Understanding Ethereum, Bitcoin’s Virtual Cousin, The New York Times. [Online; accessed 10-August-2018]

  • Romano J, Kromrey JD, Coraggio J, Skowronek J (2006) Appropriate statistics for ordinal level data: Should we really be using t-test and Cohen’s d for evaluating group differences on the NSSE and other surveys? In: Annual meeting of the Florida Association of Institutional Research, pp 1–3

  • Roos P (2015) Fast and precise statistical code completion. In: Proceedings of the 37th International Conference on Software Engineering - Volume 2. ICSE ’15. IEEE Press, Piscataway, NJ, USA, pp 757–759. http://dl.acm.org/citation.cfm?id=2819009.2819158

  • Roy CK, Cordy JR, Koschke R (May 2009) Comparison and evaluation of code clone detection techniques and tools: A qualitative approach. Sci. Comput. Program. 74(7):470–495. https://doi.org/10.1016/j.scico.2009.02.007

  • Roy CK, Cordy JR (2007) A survey on software clone detection research. Technical Report, School of Computing - Queen’s University

  • Sajnani H, Saini V, Svajlenko J, Roy CK, Lopes CV (2016) Sourcerercc: Scaling code clone detection to big-code. In: Proceedings of the 38th International Conference on Software Engineering. ICSE ’16. Association for Computing Machinery, New York, NY, USA, pp 1157–1168. https://doi.org/10.1145/2884781.2884877

  • Shannon CE, Weaver W (1963) A mathematical theory of communication. University of Illinois Press, Champaign, IL, USA

  • Sheneamer A, Kalita J (2016) A survey of software clone detection techniques. International Journal of Computer Applications 137(10):1–21. Published by Foundation of Computer Science (FCS), NY, USA

  • Skvorc B (2018) 15 Alternatives to CryptoKitties You Had No Idea Existed. [Online; accessed 02-December-2019]

  • Swan M (2015) Blockchain: Blueprint for a new economy, 1st edn. O’Reilly Media, Inc.

  • Szabo N (1994) Smart Contracts. [Online; accessed 26-August-2018]

  • Tikhomirov S, Voskresenskaya E, Ivanitskiy I, Takhaviev R, Marchenko E, Alexandrov Y (2018) Smartcheck: Static analysis of Ethereum smart contracts. In: Proceedings of the 1st International Workshop on Emerging Trends in Software Engineering for Blockchain. WETSEB ’18. ACM, New York, NY, USA, pp 9–16. https://doi.org/10.1145/3194113.3194115

  • Ukkonen E (1992) Approximate string-matching with q-grams and maximal matches. Theor Comput Sci 92(1):191–211. http://www.sciencedirect.com/science/article/pii/0304397592901434

  • Wahler V, Seipel D, Gudenberg JW, Fischer G (2004) Clone detection in source code by frequent itemset techniques. In: Proceedings of the Source Code Analysis and Manipulation, Fourth IEEE International Workshop. SCAM ’04. IEEE Computer Society, Washington, DC, USA, pp 128–135. https://doi.org/10.1109/SCAM.2004.5

  • Wood G (2017) Ethereum: A Secure Decentralised Generalised Transaction Ledger - EIP-150 Revision. [Online; accessed 10-August-2018]

  • Zheng P, Zheng Z, Luo X, Chen X, Liu X (2018) A detailed and real-time performance monitoring framework for blockchain systems. In: Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice. ICSE-SEIP ’18. ACM, New York, NY, USA, pp 134–143. https://doi.org/10.1145/3183519.3183546

Acknowledgments

This research has been supported by the Natural Sciences and Engineering Research Council (NSERC), as well as JSPS KAKENHI Japan (Grant Numbers: JP16K12415 and JP19J23477). This study leveraged the computational resources provided by the Microsoft Azure for Research program.

Author information

Corresponding author

Correspondence to Masanari Kondo.

Additional information

Communicated by: Miryung Kim

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Background

In this section, we describe concepts that are key to our study. Section A.1 defines blockchain. Section A.2 describes Ethereum accounts. Section A.3 introduces smart contracts, including how one deploys, verifies, and executes smart contracts. Finally, Section A.4 defines tokens, token contracts, and mintable token contracts.

A.1: Blockchain

A blockchain is a distributed, chronological database of transactions that is shared and maintained across nodes that participate in a peer-to-peer network. Ethereum and Bitcoin are two of the most popular blockchain platforms. As of January 2020, the Ethereum platform holds a remarkable market capitalization of 15.67 billion USD (see Note 25).
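The market capitalization figure can be reproduced from the supply and price quoted in Note 25. The following is a minimal Python sketch of that arithmetic; the function name `market_cap` is hypothetical and chosen only for illustration.

```python
# Figures quoted in Note 25 (as of January 7, 2020).
ETHER_SUPPLY = 109_174_249   # total ether supply
PRICE_USD = 143.55           # market price of one ether, in USD


def market_cap(supply: int, price: float) -> float:
    """Market capitalization = total coin supply x unit market price."""
    return supply * price


cap = market_cap(ETHER_SUPPLY, PRICE_USD)
print(f"{cap / 1e9:.2f} billion USD")  # prints "15.67 billion USD"
```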

Transactions are at the heart of blockchains. The name blockchain comes from the manner in which transactions are stored. More specifically, transactions are packaged into blocks that are linked to one another as a chain. Adding a new transaction to a blockchain requires confirmation from several nodes of the network, which all abide by a certain consensus protocol. Such a protocol is designed to be costly (e.g., in terms of computing power or time) in order to ensure that tampering with the data is infeasible. The Ethereum platform uses the computationally costly Proof-of-Work (PoW) consensus protocol (Jakobsson and Juels 1999), which requires nodes to solve a hard mathematical puzzle. The PoW consensus protocol ensures that there is no better strategy to find the solution to the mathematical puzzle than enumerating the possibilities (i.e., brute force). On the other hand, verification of a solution is trivial and cheap. Ultimately, the PoW consensus protocol ensures that a trusted third party (e.g., a bank) is not needed in order to validate transactions, enabling entities who do not know or trust each other to build a dependable transaction ledger.
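The asymmetry between finding and checking a PoW solution can be illustrated with a toy puzzle. The Python sketch below is not Ethereum's actual algorithm: the use of SHA-256, the 8-byte nonce layout, and the names `mine` and `verify` are all assumptions made for illustration. The puzzle is to find a nonce whose hash falls below a target; solving requires enumeration, while checking takes a single hash.

```python
import hashlib


def _digest(block_data: bytes, nonce: int) -> int:
    """Hash the block data together with a candidate nonce (toy layout)."""
    payload = block_data + nonce.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Brute force: try nonces 0, 1, 2, ... until the hash is below the target."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while _digest(block_data, nonce) >= target:
        nonce += 1
    return nonce


def verify(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a claimed solution costs one hash evaluation."""
    return _digest(block_data, nonce) < (1 << (256 - difficulty_bits))


data = b"alice pays bob 1 ETH"
nonce = mine(data, difficulty_bits=12)   # about 2**12 attempts on average
print(nonce, verify(data, nonce, 12))
```

Raising `difficulty_bits` by one doubles the expected mining effort but leaves the verification cost unchanged.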

Once a block is appended to the blockchain, its contents cannot be altered without changing every other block that came after it. In practice, a transaction is deemed final and irreversible after six block confirmations (i.e., after six new blocks have been added to the blockchain). More generally, due to the PoW consensus protocol, it is infeasible to change the contents of old blocks without controlling more than 50% of the computing power that runs Ethereum.
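To see why altering an old block forces changes to every later block, consider a simplified chain in which each block stores the hash of its predecessor. The Python sketch below is illustrative only: the `Block` class, the string-encoded transactions, and the use of SHA-256 are assumptions, not Ethereum's block format.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Block:
    transactions: str
    prev_hash: str  # hash of the preceding block

    def hash(self) -> str:
        data = (self.prev_hash + self.transactions).encode()
        return hashlib.sha256(data).hexdigest()


def build_chain(batches):
    """Link each block to its predecessor through the predecessor's hash."""
    chain, prev = [], "0" * 64
    for txs in batches:
        block = Block(txs, prev)
        chain.append(block)
        prev = block.hash()
    return chain


def is_consistent(chain) -> bool:
    """Every block must store the current hash of the block before it."""
    return all(chain[i].prev_hash == chain[i - 1].hash()
               for i in range(1, len(chain)))


chain = build_chain(["a->b:5", "b->c:2", "c->a:1"])
print(is_consistent(chain))         # True
chain[0].transactions = "a->b:500"  # tamper with the oldest block
print(is_consistent(chain))         # False: block 1 no longer matches block 0
```

Restoring consistency after the tampering would require recomputing the stored hash of every subsequent block, which under PoW means redoing the work for each of them.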

A.2: Ethereum Accounts

The Ethereum platform supports two types of accounts: user accounts and smart contract accounts. A user account is very simple in structure: it has an address (a 40-digit hexadecimal ID), a transaction count, and an ETH balance (ETH is the official Ethereum cryptocurrency). A contract account, in turn, holds the bytecode of a smart contract in addition to the previously mentioned fields. By means of a transaction, a user account can transfer ETH to another account, deploy a smart contract (Section A.3.2), or execute a function of a smart contract (Section A.3.4).
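The two account types can be pictured as one record with an optional bytecode field. The Python sketch below is a simplified model for illustration: the class and field names are hypothetical, the balance is kept as an integer in wei (the smallest ETH unit, 10^18 wei per ETH), and the bytecode shown is a placeholder.

```python
import re
from dataclasses import dataclass
from typing import Optional

# "0x" followed by 40 hexadecimal digits.
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


@dataclass
class Account:
    address: str
    transaction_count: int = 0
    balance_wei: int = 0
    bytecode: Optional[bytes] = None  # set only for contract accounts

    def __post_init__(self):
        if not ADDRESS_RE.fullmatch(self.address):
            raise ValueError("address must be 0x followed by 40 hex digits")

    @property
    def is_contract(self) -> bool:
        return self.bytecode is not None


user = Account("0x" + "ab" * 20, balance_wei=10**18)
contract = Account("0x4672bAD527107471cB5067a887f4656D585a8A31",
                   bytecode=b"\x60\x80")  # placeholder bytes
print(user.is_contract, contract.is_contract)  # False True
```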

A.3: Smart Contracts

The key difference between Ethereum and Bitcoin is that the former supports smart contracts. The term smart contract was coined by Szabo (1994). According to him, “a smart contract is a computerized transaction protocol that executes the terms of a contract.” More recently, with the advent of Ethereum and other sophisticated blockchain platforms, the concept of smart contracts has become much broader, representing any general-purpose computation. For instance, smart contracts have been used to implement crowdfunding campaigns (e.g., by selling tokens to the public, similarly to an IPO; see Note 26), RPG games (e.g., MyCryptoHeroes; see Note 27), and (crypto)currency trading platforms (e.g., IDEX; see Note 28). Blockchain platforms that support smart contracts are known as programmable blockchains.

A.3.1: Source Code

The source code of a smart contract is written in the Solidity language, whose syntax is similar to that of Java. An illustrative example is shown in Fig. 24. In order to enable the separation of concerns (Dijkstra 1982), the Solidity language provides three key constructs: subcontracts, libraries, and interfaces. When convenient, we refer to them collectively as code blocks.

Fig. 24: An example of a smart contract written in Solidity

Subcontracts

Subcontracts are similar to classes (as in object-oriented programming). As such, subcontracts typically implement a certain concept (lines 27-51 and 53-65 from the example). Similarly to Java, a subcontract is deemed abstract when at least one of its functions lacks an implementation. Abstract subcontracts cannot be instantiated, since they are meant to be used as base subcontracts. If a subcontract A inherits from a base subcontract B, then we say that A is a child of B (and that B is a parent of A).

Interfaces

The concept of interfaces comes straight from object-oriented programming. Interfaces are thus similar to abstract subcontracts, but they cannot have any implemented functions (lines 20-25 from the example). In Solidity, subcontracts realize an interface by inheriting from it (line 27 from the example).

Libraries

A library is an isolated piece of code that is meant to be stateless. Libraries often provide a set of utility methods that are mindful of corner cases or that optimize processing time. For instance, a developer might implement a library that performs mathematical operations safely, guarding against integer overflows (lines 4-18 from the example).
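The idea behind such an overflow-guarding library can be sketched outside Solidity. The Python code below emulates checked arithmetic on 256-bit unsigned integers; it is not the library from Fig. 24, and the function names are hypothetical. It assumes the wrap-around behavior of fixed-width unsigned arithmetic, which Solidity versions before 0.8 exhibit silently.

```python
UINT256_MAX = 2**256 - 1  # largest value of a 256-bit unsigned integer


def safe_add(a: int, b: int) -> int:
    """Add two uint256 values, failing instead of wrapping around."""
    c = a + b
    if c > UINT256_MAX:
        raise OverflowError("addition overflow")
    return c


def safe_sub(a: int, b: int) -> int:
    """Subtract b from a, failing instead of wrapping below zero."""
    if b > a:
        raise OverflowError("subtraction underflow")
    return a - b


def safe_mul(a: int, b: int) -> int:
    """Multiply two uint256 values, failing instead of wrapping around."""
    c = a * b
    if c > UINT256_MAX:
        raise OverflowError("multiplication overflow")
    return c


# Unchecked 256-bit addition would wrap to 0 here.
print((UINT256_MAX + 1) % 2**256)  # 0
print(safe_add(2, 3))              # 5
```

Being stateless, these functions depend only on their arguments, which matches the role the text assigns to libraries.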

A.3.2: Deployment

In Ethereum, a user account can deploy smart contracts. The deployment of a smart contract is done by means of a transaction (see Note 29) that is sent to the blockchain. Such a transaction is commonly referred to as the contract creation transaction. Upon the successful execution of this transaction, the contract is deployed in the blockchain and receives an address. This transaction also records the address of the user account that deployed the contract. This user account is often referred to as the creator (author) of the contract.

A.3.3: Verification

When a user account deploys a smart contract to the Ethereum platform, only the bytecode is stored in the blockchain. Therefore, it is up to the developer to publish the source code of the smart contract. The Etherscan website, which is the primary Ethereum dashboard website, provides a code transparency mechanism known as contract verification. This mechanism lets developers publish the source code of a smart contract on Etherscan, so that it becomes available to anyone who is interested in the Ethereum platform. The code verification mechanism works as follows: (i) the developer uploads a flattened version of the source code (i.e., a single file containing all the source code; see Note 30) and indicates a particular version of the Solidity compiler, (ii) Etherscan compiles the code using the developer-indicated compiler version, and (iii) Etherscan checks whether the generated bytecode matches the bytecode that is stored in the blockchain. If there is a perfect match, then the smart contract is deemed verified and the flattened version of the source code becomes publicly available on Etherscan. We refer to this flattened version of the source code that is published on Etherscan as the code file of a verified contract. A list of verified contracts can be found at https://etherscan.io/contractsVerified.
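The three verification steps reduce to "compile with the stated version, then compare bytecode". The Python sketch below illustrates only that control flow. It does not call Etherscan or a real Solidity compiler: `verify_contract` is a hypothetical name, and `fake_compile` is a stand-in that merely produces deterministic, version-dependent bytes.

```python
import hashlib
from typing import Callable

# A compiler maps (flattened source, compiler version) to bytecode.
Compiler = Callable[[str, str], bytes]


def verify_contract(flattened_source: str, compiler_version: str,
                    onchain_bytecode: bytes, compile_fn: Compiler) -> bool:
    """Steps (ii) and (iii): compile with the stated version, then compare."""
    candidate = compile_fn(flattened_source, compiler_version)
    return candidate == onchain_bytecode


def fake_compile(source: str, version: str) -> bytes:
    """Stand-in for a real compiler: deterministic and version-sensitive."""
    return hashlib.sha256(f"{version}\n{source}".encode()).digest()


source = "contract Token { }"
deployed = fake_compile(source, "0.4.24")  # pretend this is on the blockchain

print(verify_contract(source, "0.4.24", deployed, fake_compile))  # True
print(verify_contract(source, "0.5.0", deployed, fake_compile))   # False
```

The second call shows why the developer must indicate the compiler version: the same source compiled differently does not reproduce the stored bytecode.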

A.3.4: Execution

In Ethereum, a user account can not only deploy but also execute contracts. A user account executes a smart contract by sending transactions to it. These transactions carry data that specify which function should be executed, as well as data regarding the input parameters of this function. Figure 25 shows an example of a transaction in which the transaction issuer (a user account) transferred tokens to another user account by executing the transfer(address _to, uint256 _value) function of a smart contract.

Fig. 25: An example of a smart contract transaction. Image extracted from Etherscan (https://etherscan.io/tx/0xfe742a94a36e348451c7ad99cc74715f3d052b4874242f5f1d52f9cf46c9024f)
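The data carried by such a transaction can be sketched concretely. Under the Solidity ABI convention, the call data for transfer(address,uint256) consists of a 4-byte function selector followed by each argument padded to 32 bytes. The Python sketch below is illustrative: the helper name `encode_transfer` is hypothetical, and the selector 0xa9059cbb is hardcoded because the Python standard library does not provide the Keccak-256 hash needed to derive it. The recipient address is taken from Fig. 26 purely as sample input.

```python
# First 4 bytes of Keccak-256("transfer(address,uint256)"), hardcoded.
TRANSFER_SELECTOR = bytes.fromhex("a9059cbb")


def encode_transfer(to: str, value: int) -> bytes:
    """Build call data: selector + 32-byte address word + 32-byte value word."""
    hex_part = to[2:] if to.startswith("0x") else to
    addr = bytes.fromhex(hex_part)
    if len(addr) != 20:
        raise ValueError("address must be 20 bytes (40 hex digits)")
    if not 0 <= value < 2**256:
        raise ValueError("value must fit in a uint256")
    # Addresses are left-padded with zeros; integers are big-endian.
    return TRANSFER_SELECTOR + addr.rjust(32, b"\x00") + value.to_bytes(32, "big")


call_data = encode_transfer("0x4672bAD527107471cB5067a887f4656D585a8A31", 1000)
print(len(call_data))      # 68 bytes: 4 + 32 + 32
print(call_data.hex()[:8])  # a9059cbb
```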

A.4: Cryptocurrency, Tokens, and Coins

A cryptocurrency is a virtual artifact which represents money. A cryptocurrency is native to its own blockchain. In the case of Ethereum, the cryptocurrency is called Ether and is abbreviated as ETH. Ether can be transferred between user accounts. Ether (ETH) is not much different from traditional currencies like US Dollars (USD) and Euros (EUR). The only practical difference is that metal coins and paper notes represent dollars and euros in the physical world, whereas cryptocurrencies are purely virtual.

Tokens are created on top of existing blockchains. Tokens are used to represent digital assets that are tradeable (and usually fungible), including everything from commodities to voting rights. Every token has a name and an acronym (popularly known as a symbol) and any smart contract can define a new token. It is common for tokens to represent money. Therefore, in practice, (crypto)coins and tokens are frequently used interchangeably. For instance, a crowdfunding initiative that is implemented as a distribution of tokens is more commonly referred to as an ICO (Initial *Coin* Offering) instead of an ITO (Initial *Token* Offering). There are several physical and virtual currency exchanges (e.g., IDEX; see Note 31) around the world that buy and sell (crypto)coins, as well as exchange one (crypto)coin for another.

A.4.1: Token Contract

A token contract is a special kind of smart contract that defines a token and keeps track of its balance across user accounts. Ethereum has two main technical standards for the implementation of tokens, known as ERC20 and ERC721. The standardization allows contracts to operate on different tokens seamlessly, thus fostering interoperability between smart contracts. From an implementation perspective, ERC20 and ERC721 are object-oriented interfaces defining several functions, such as totalSupply(), balanceOf(address who), and transfer(address to, uint256 value) (check IERC20 in Fig. 24).
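The bookkeeping that a token contract performs can be modeled as a mapping from accounts to balances. The Python sketch below mirrors the three functions named above; it is a simplified model rather than an ERC20 implementation. The class name `SimpleToken` and the account labels are hypothetical, and the sender is passed explicitly because Python has no equivalent of Solidity's implicit transaction sender.

```python
class SimpleToken:
    """Toy ledger exposing total_supply, balance_of, and transfer."""

    def __init__(self, name: str, symbol: str, initial_supply: int, owner: str):
        self.name = name
        self.symbol = symbol
        self._total_supply = initial_supply
        self._balances = {owner: initial_supply}  # all tokens start with the owner

    def total_supply(self) -> int:
        return self._total_supply

    def balance_of(self, who: str) -> int:
        return self._balances.get(who, 0)

    def transfer(self, sender: str, to: str, value: int) -> bool:
        """Move value tokens from sender to to; the total supply is unchanged."""
        if value < 0 or self.balance_of(sender) < value:
            raise ValueError("insufficient balance")
        self._balances[sender] = self.balance_of(sender) - value
        self._balances[to] = self.balance_of(to) + value
        return True


token = SimpleToken("Example Token", "EXT", 1000, owner="alice")
token.transfer("alice", "bob", 250)
print(token.balance_of("alice"), token.balance_of("bob"), token.total_supply())
# prints: 750 250 1000
```

Because every token exposes the same functions, a contract written against this interface can operate on any conforming token, which is the interoperability benefit the standard aims for.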

A.4.2: Mintable Token and Mintable Token Contract

A mintable token is a special kind of token that has a non-fixed total supply. Most mintable token contracts are ERC20 token contracts with an added mint() function, which increases the total token supply upon invocation. Optionally, a burn() function is also included to decrease the total supply. Bitcoin (BTC), the official cryptocurrency of the homonymous blockchain platform, is a mintable token. In particular, 12.5 newly created BTC are given as a reward to those who put the next block on the Bitcoin blockchain platform. Ether (ETH) is not mintable.
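The effect of mint() and burn() on the total supply can be shown with a small model. The Python sketch below is illustrative only: the class name, the parameters of `mint` and `burn`, and the account labels are assumptions, and the access control that real contracts typically place on minting is omitted.

```python
class MintableToken:
    """Toy token whose total supply grows on mint and shrinks on burn."""

    def __init__(self):
        self.total_supply = 0
        self.balances = {}

    def mint(self, to: str, amount: int) -> None:
        """Create amount new tokens and credit them to to."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.balances[to] = self.balances.get(to, 0) + amount
        self.total_supply += amount

    def burn(self, holder: str, amount: int) -> None:
        """Destroy amount tokens held by holder."""
        if amount <= 0 or self.balances.get(holder, 0) < amount:
            raise ValueError("invalid burn amount")
        self.balances[holder] -= amount
        self.total_supply -= amount


token = MintableToken()
token.mint("alice", 100)
token.burn("alice", 40)
print(token.total_supply, token.balances["alice"])  # 60 60
```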

Appendix B: Additional Resources

B.1: Top-10 Largest Clusters: UML Diagram of Each Representative Contract

Fig. 26: Representative contract of clone cluster 1. This contract is deployed at the address 0x4672bAD527107471cB5067a887f4656D585a8A31

Fig. 27: Representative contract of clone cluster 2. This contract is deployed at the address 0x8912358d977e123b51ecad1ffa0cc4a7e32ff774

Fig. 28: Representative contract of clone cluster 3. This contract is deployed at the address 0x30392c252da07b69194972e9f770b6dd5deb7af8

Fig. 29: Representative contract of clone cluster 4. This contract is deployed at the address 0xa7d7609766d7fcfaf38eda123454bf94b1c1abf0

Fig. 30: Representative contract of clone cluster 5. This contract is deployed at the address 0x9064c91e51d7021a85ad96817e1432abf6624470

Fig. 31: Representative contract of clone cluster 6. This contract is deployed at the address 0x47a892bf7336a120ee69b2db6acb552acad5f46d

Fig. 32: Representative contract of clone cluster 7. This contract is deployed at the address 0x1d8e5dcf365864fcda8ab39d1f60d66ee2b82770

Fig. 33: Representative contract of clone cluster 8. This contract is deployed at the address 0x9a642d6b3368ddc662ca244badf32cda716005bc

Fig. 34: Representative contract of clone cluster 9. This contract is deployed at the address 0x2eb86e8fc520e0f6bb5d9af08f924fe70558ab89

Fig. 35: Representative contract of clone cluster 10. This contract is deployed at the address 0x0a2eaa1101bfec3844d9f79dd4e5b2f2d5b1fd4d

Cite this article

Kondo, M., Oliva, G.A., Jiang, Z.M. et al. Code cloning in smart contracts: a case study on verified contracts from the Ethereum blockchain platform. Empir Software Eng 25, 4617–4675 (2020). https://doi.org/10.1007/s10664-020-09852-5

  • DOI: https://doi.org/10.1007/s10664-020-09852-5
