
Divide & Scale: Formalization and Roadmap to Robust Sharding

Conference paper in Structural Information and Communication Complexity (SIROCCO 2023).

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13892).

Abstract

Sharding distributed ledgers is a promising on-chain solution for scaling blockchains but lacks formal grounds, nurturing skepticism on whether such complex systems can scale blockchains securely. We fill this gap by introducing the first formal framework as well as a roadmap to robust sharding. In particular, we first define the properties sharded distributed ledgers should fulfill. We build upon and extend the Bitcoin backbone protocol by defining consistency and scalability. Consistency encompasses the need for atomic execution of cross-shard transactions to preserve safety, whereas scalability encapsulates the speedup a sharded system can gain in comparison to a non-sharded system.

Using our model, we explore the limitations of sharding. We show that a sharded ledger with n participants cannot scale under a fully adaptive adversary, but it can scale up to m shards where \(n=c'm\log m\), under an epoch-adaptive adversary; the constant \(c'\) encompasses the trade-off between security and scalability. This is possible only if the sharded ledgers create succinct proofs of the valid state updates at every epoch. We leverage our results to identify the sufficient components for robust sharding, which we incorporate in a protocol abstraction termed Divide & Scale. To demonstrate the power of our framework, we analyze the most prominent sharded blockchains (Elastico, Monoxide, OmniLedger, RapidChain) and pinpoint where they fail to meet the desired properties.


Notes

  1. Validity depends on the application using the ledger.

  2. Only one of the ledgers is actually committed as part of the shard’s ledger, but before commitment there are multiple potential ledgers.

  3. To scale in bandwidth, the block size cannot depend on the parties or transactions.

  4. Without loss of generality, we assume all transactions are valid and thus are eventually included in all honest parties’ ledgers.

  5. Due to their inherent inability to asymptotically scale, we believe uni-consensus systems are categorized as performance optimizations of consensus, e.g., [5, 7, 19, 49, 50].

  6. The final committee in Elastico broadcasts only the Merkle root for each block. However, this is asymptotically equivalent to including all transactions since the block size is constant. Furthermore, the final committee does not check whether the received transactions conflict but merely verifies the presence of signatures.

  7. Note that if v is constant, a more elaborate analysis could yield a tighter upper bound on \(m'\) than m/v (depending on \(D_T\)). However, if v is not constant but approximates the number of shards m, then \(m'\) is also bounded by the scalability of the Atomix protocol (Lemma 39), and thus the throughput factor can be much lower.

References

  1. Bitcoin statistics on transaction utxos. https://bitcoinvisuals.com/. Accessed 20 Nov 2020

  2. Al-Bassam, M.: LazyLedger: a distributed data availability ledger with client-side smart contracts. arXiv preprint arXiv:1905.09274 (2019)

  3. Al-Bassam, M., Sonnino, A., Bano, S., Hrycyszyn, D., Danezis, G.: Chainspace: a sharded smart contracts platform. In: 25th Annual Network and Distributed System Security Symposium (2018)


  4. Al-Bassam, M., Sonnino, A., Buterin, V.: Fraud and data availability proofs: maximising light client security and scaling blockchains with dishonest majorities. arXiv preprint arXiv:1809.09044 (2018)

  5. Androulaki, E., et al.: Hyperledger fabric: a distributed operating system for permissioned blockchains. In: Proceedings of the 13th EuroSys Conference, pp. 30:1–30:15 (2018)


  6. Androulaki, E., Cachin, C., De Caro, A., Kokoris-Kogias, E.: Channels: horizontal scaling and confidentiality on permissioned blockchains. In: Lopez, J., Zhou, J., Soriano, M. (eds.) ESORICS 2018. LNCS, vol. 11098, pp. 111–131. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-99073-6_6


  7. Avarikioti, Z., Heimbach, L., Schmid, R., Wattenhofer, R.: FnF-BFT: exploring performance limits of BFT protocols. arXiv preprint arXiv:2009.02235 (2020)

  8. Bano, S., et al.: SoK: consensus in the age of blockchains. In: Proceedings of the 1st ACM Conference on Advances in Financial Technologies, pp. 183–198. ACM (2019)


  9. Ben-Sasson, E.: A cambrian explosion of crypto proofs (2020). https://nakamoto.com/cambrian-explosion-of-crypto-proofs/

  10. Boneh, D., Bünz, B., Fisch, B.: Batching techniques for accumulators with applications to IOPs and stateless blockchains. In: Boldyreva, A., Micciancio, D. (eds.) CRYPTO 2019. LNCS, vol. 11692, pp. 561–586. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-26948-7_20


  11. Bonneau, J., Clark, J., Goldfeder, S.: On bitcoin as a public randomness source. IACR Cryptology ePrint Archive, Report 2015/1015 (2015)


  12. Borge, M., Kokoris-Kogias, E., Jovanovic, P., Gasser, L., Gailly, N., Ford, B.: Proof-of-personhood: redemocratizing permissionless cryptocurrencies. In: IEEE European Symposium on Security and Privacy Workshops, pp. 23–26 (2017)


  13. Bracha, G., Toueg, S.: Asynchronous consensus and broadcast protocols. J. ACM 32(4), 824–840 (1985)


  14. Bünz, B., Goldfeder, S., Bonneau, J.: Proofs-of-delay and randomness beacons in ethereum. In: IEEE Security and Privacy on the Blockchain (2017)


  15. Bünz, B., Kiffer, L., Luu, L., Zamani, M.: FlyClient: super-light clients for cryptocurrencies. In: IEEE Symposium on Security and Privacy, pp. 928–946 (2020)


  16. Cascudo, I., David, B.: SCRAPE: scalable randomness attested by public entities. In: Gollmann, D., Miyaji, A., Kikuchi, H. (eds.) ACNS 2017. LNCS, vol. 10355, pp. 537–556. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-61204-1_27


  17. Castro, M., Liskov, B.: Practical byzantine fault tolerance. In: Proceedings of the 3rd USENIX Symposium on Operating Systems Design and Implementation, pp. 173–186 (1999)


  18. Chatzigiannis, P., Baldimtsi, F., Chalkias, K.: SoK: blockchain light clients. In: Eyal, I., Garay, J. (eds.) FC 2022. LNCS, vol. 13411, pp. 615–641. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-18283-9_31


  19. Danezis, G., Kokoris-Kogias, L., Sonnino, A., Spiegelman, A.: Narwhal and tusk: a DAG-based mempool and efficient BFT consensus. In: Proceedings of the Seventeenth European Conference on Computer Systems, pp. 34–50 (2022)


  20. Das, S., Yurek, T., Xiang, Z., Miller, A., Kokoris-Kogias, L., Ren, L.: Practical asynchronous distributed key generation. In: 2022 IEEE Symposium on Security and Privacy (SP), pp. 2518–2534. IEEE (2022)


  21. David, B., Gaži, P., Kiayias, A., Russell, A.: Ouroboros praos: an adaptively-secure, semi-synchronous proof-of-stake blockchain. In: Nielsen, J.B., Rijmen, V. (eds.) EUROCRYPT 2018. LNCS, vol. 10821, pp. 66–98. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-78375-8_3


  22. Feldman, P.: A practical scheme for non-interactive verifiable secret sharing. In: 28th Annual IEEE Symposium on Foundations of Computer Science, pp. 427–438. IEEE (1987)


  23. Fisher, T., Funk, D., Sams, R.: The birthday problem and generalizations. Carlton College, Mathematics Comps Gala (2013). https://d31kydh6n6r5j5.cloudfront.net/uploads/sites/66/2019/04/birthday_comps.pdf

  24. Garay, J., Kiayias, A., Leonardos, N.: The bitcoin backbone protocol: analysis and applications. In: Oswald, E., Fischlin, M. (eds.) EUROCRYPT 2015. LNCS, vol. 9057, pp. 281–310. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-46803-6_10


  25. Gilad, Y., Hemo, R., Micali, S., Vlachos, G., Zeldovich, N.: Algorand: scaling byzantine agreements for cryptocurrencies. In: Proceedings of the 26th Symposium on Operating Systems Principles, pp. 51–68. ACM (2017)


  26. Kadhe, S., Chung, J., Ramchandran, K.: SeF: a secure fountain architecture for slashing storage costs in blockchains. arXiv preprint arXiv:1906.12140 (2019)

  27. Kattis, A., Bonneau, J.: Proof of necessary work: succinct state verification with fairness guarantees. IACR Cryptology ePrint Archive, Report 2020/190 (2020)


  28. Kiayias, A., Miller, A., Zindros, D.: Non-interactive proofs of proof-of-work. In: Bonneau, J., Heninger, N. (eds.) FC 2020. LNCS, vol. 12059, pp. 505–522. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-51280-4_27


  29. Kiayias, A., Panagiotakos, G.: On trees, chains and fast transactions in the blockchain. In: Lange, T., Dunkelman, O. (eds.) LATINCRYPT 2017. LNCS, vol. 11368, pp. 327–351. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25283-0_18


  30. Kiayias, A., Russell, A., David, B., Oliynykov, R.: Ouroboros: a provably secure proof-of-stake blockchain protocol. In: Katz, J., Shacham, H. (eds.) CRYPTO 2017. LNCS, vol. 10401, pp. 357–388. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63688-7_12


  31. Kogias, E.K., Jovanovic, P., Gailly, N., Khoffi, I., Gasser, L., Ford, B.: Enhancing bitcoin security and performance with strong consistency via collective signing. In: 25th USENIX Security Symposium, pp. 279–296 (2016)


  32. Kokoris-Kogias, E.: Robust and scalable consensus for sharded distributed ledgers. IACR Cryptology ePrint Archive, Report 2019/676 (2019)


  33. Kokoris-Kogias, E., Jovanovic, P., Gasser, L., Gailly, N., Syta, E., Ford, B.: OmniLedger: a secure, scale-out, decentralized ledger via sharding. In: 39th IEEE Symposium on Security and Privacy, pp. 583–598. IEEE (2018)


  34. Kokoris-Kogias, E., Malkhi, D., Spiegelman, A.: Asynchronous distributed key generation for computationally-secure randomness, consensus, and threshold signatures. In: 27th ACM SIGSAC Conference on Computer and Communications Security, pp. 1751–1767. ACM (2020)


  35. Lamport, L., Shostak, R., Pease, M.: The byzantine generals problem. In: Concurrency: The Works of Leslie Lamport, pp. 203–226 (2019)


  36. Luu, L., Narayanan, V., Zheng, C., Baweja, K., Gilbert, S., Saxena, P.: A secure sharding protocol for open blockchains. In: Proceedings of the 25th ACM SIGSAC Conference on Computer and Communications Security, pp. 17–30. ACM (2016)


  37. Martino, W., Quaintance, M., Popejoy, S.: Chainweb: a proof-of-work parallel-chain architecture for massive throughput (2018). https://www.kadena.io/whitepapers

  38. Maymounkov, P., Mazières, D.: Kademlia: a peer-to-peer information system based on the XOR metric. In: Druschel, P., Kaashoek, F., Rowstron, A. (eds.) IPTPS 2002. LNCS, vol. 2429, pp. 53–65. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45748-8_5


  39. Meckler, I., Shapiro, E.: Coda: decentralized cryptocurrency at scale (2018). https://cdn.codaprotocol.com/static/coda-whitepaper-05-10-2018-0.pdf

  40. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008)


  41. Pass, R., Seeman, L., Shelat, A.: Analysis of the blockchain protocol in asynchronous networks. In: Coron, J.-S., Nielsen, J.B. (eds.) EUROCRYPT 2017. LNCS, vol. 10211, pp. 643–673. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-56614-6_22


  42. Raab, M., Steger, A.: “Balls into bins’’—a simple and tight analysis. In: Luby, M., Rolim, J.D.P., Serna, M. (eds.) RANDOM 1998. LNCS, vol. 1518, pp. 159–170. Springer, Heidelberg (1998). https://doi.org/10.1007/3-540-49543-6_13


  43. Rana, R., Kannan, S., Tse, D., Viswanath, P.: Free2shard: adaptive-adversary-resistant sharding via dynamic self allocation. arXiv preprint arXiv:2005.09610 (2020)

  44. Ren, L., Nayak, K., Abraham, I., Devadas, S.: Practical synchronous byzantine consensus. arXiv preprint arXiv:1704.02397 (2017)

  45. Schindler, P., Judmayer, A., Stifter, N., Weippl, E.: HydRand: practical continuous distributed randomness. IACR Cryptology ePrint Archive, Report 2018/319 (2018)


  46. Sen, S., Freedman, M.J.: Commensal cuckoo: secure group partitioning for large-scale services. ACM SIGOPS Oper. Syst. Rev. 46(1), 33–39 (2012)


  47. Sompolinsky, Y., Zohar, A.: Secure high-rate transaction processing in bitcoin. In: Böhme, R., Okamoto, T. (eds.) FC 2015. LNCS, vol. 8975, pp. 507–527. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-47854-7_32


  48. Sonnino, A., Bano, S., Al-Bassam, M., Danezis, G.: Replay attacks and defenses against cross-shard consensus in sharded distributed ledgers. arXiv preprint arXiv:1901.11218 (2019)

  49. Spiegelman, A., Giridharan, N., Sonnino, A., Kokoris-Kogias, L.: Bullshark: DAG BFT protocols made practical. In: Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, pp. 2705–2718 (2022)


  50. Stathakopoulou, C., David, T., Vukolić, M.: Mir-BFT: High-throughput BFT for blockchains. arXiv preprint arXiv:1906.05552 (2019)

  51. Syta, E., et al.: Scalable bias-resistant distributed randomness. In: IEEE Symposium on Security and Privacy, pp. 444–460 (2017)


  52. Wang, G., Shi, Z.J., Nixon, M., Han, S.: SoK: Sharding on blockchain. In: Proceedings of the 1st ACM Conference on Advances in Financial Technologies, pp. 41–61. ACM (2019)


  53. Wang, J., Wang, H.: Monoxide: scale out blockchains with asynchronous consensus zones. In: 16th USENIX Symposium on Networked Systems Design and Implementation, pp. 95–112 (2019)


  54. Yin, M., Malkhi, D., Reiter, M.K., Gueta, G.G., Abraham, I.: HotStuff: BFT consensus with linearity and responsiveness. In: Proceedings of the 38th ACM Symposium on Principles of Distributed Computing, pp. 347–356. ACM (2019)


  55. Zamani, M., Movahedi, M., Raykova, M.: RapidChain: scaling blockchain via full sharding. In: Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pp. 931–948. ACM (2018)


  56. Zamyatin, A., et al.: SoK: communication across distributed ledgers. IACR Cryptology ePrint Archive, Report 2019/1128 (2019)



Acknowledgments

The work was partially supported by the Austrian Science Fund (FWF) through the project CoRaF (grant agreement 2020388).

Correspondence to Zeta Avarikioti.

Appendices

A Limitations of Sharding Protocols

1.1 A.1 General Bounds

First, we prove there is no robust sharded transaction ledger that has a constant number of shards. Then, we show that there is no protocol that maintains a robust sharded transaction ledger against an adaptive adversary.

Lemma 14

In any robust sharded transaction ledger, the number of shards (parametrized by n) is \(m=\omega (1)\).

Proof

Suppose there is a protocol that maintains a constant number m of sharded ledgers, denoted by \(x_1, x_2, \dots , x_m\). Let n denote the number of parties and T the number of transactions to be processed (wlog assumed to be valid). A transaction is processed only if it is stable, i.e. it is included deep enough in a ledger (k blocks from the end of the ledger, where k is a security parameter). Each ledger will include T/m transactions on expectation. Now suppose each party participates in only one ledger (best case), thus broadcasts, verifies, and stores the transactions of that ledger only. Hence, every party stores T/m transactions on expectation. The expected space factor is \(\omega _s=\sum _{\forall i \in [n]} \sum _{\forall x \in L_i} |x^{\lceil k}| / (n|T|) = n \cdot \frac{T/m}{nT} = \varTheta (\frac{1}{m})=\varTheta (1)\), when m is constant. Thus, scalability is not satisfied.

Suppose a party is participating in shard \(x_i\). If the party maintains information (e.g. the headers of the chain for verification purposes) on the chain of shard \(x_j\), we say that the party is a light node for shard \(x_j\). In particular, a light node for shard \(x_j\) maintains information at least proportional to the length of the shard’s chain \(x_j\). This holds because blocks must be of constant size to be able to scale in bandwidth (aka communication), and thus storing all the headers of a shard is asymptotically similar in overhead to storing the entire shard with the block content. Sublinear light clients [15, 28] verifiably compact the shard’s state, thus are not considered light nodes but are discussed later. We next prove that if parties act as light clients to all shards involved in cross-shard transactions, then the sharded ledger can scale only if each shard does not interact with all the other shards (or a constant fraction thereof).

Lemma 15

For any robust sharded transaction ledger that requires every participant to be a light node for all the shards affected by cross-shard transactions, it holds \(\mathbb {E}(\gamma )=o(m)\).

Proof

We assumed that every ledger interacts on average with \(\gamma \) different ledgers, i. e., the cross-shard transactions involve \(\gamma \) many different shards on expectation. The block size is considered constant, meaning each block includes at most e transactions where e is constant. Thus, each party maintaining a ledger and being a light node to \(\gamma \) other ledgers must store on expectation \((1+\frac{\gamma }{e})\frac{T}{m}\) information. Hence, the expected space factor is

$$\mathbb {E}(\omega _s)= \sum _{\forall i \in [n]} \sum _{\forall x \in L_i} |x^{\lceil k}| / (n|T|) = n \frac{(1+\frac{\gamma }{e})\frac{T}{m}}{nT}= \varTheta \Big (\frac{\gamma }{m}\Big )$$

where the second equation holds due to linearity of expectation. To satisfy scalability, we demand \(\mathbb {E}(\omega _s)=o(1)\), thus \(\gamma =o(m)\).
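As a rough numerical illustration (not part of the formal analysis, and with arbitrary parameter values n, T, e), the following Python sketch evaluates the expected space factor of Lemma 15 for a party that stores its own shard and acts as a light node for \(\gamma \) other shards.

```python
# Minimal sketch (illustrative parameters): expected space factor from Lemma 15.
# A party stores its own shard (T/m transactions) plus, as a light node, the headers
# of the gamma shards it interacts with (~gamma/e * T/m entries, e transactions per block).

def expected_space_factor(n, T, m, gamma, e=100):
    per_party = (1 + gamma / e) * (T / m)   # storage of a single party, in transactions
    return n * per_party / (n * T)          # omega_s = sum over parties / (n * |T|)

for m, gamma in [(10, 9), (100, 99), (1000, 999), (1000, 10)]:
    ws = expected_space_factor(n=10_000, T=1_000_000, m=m, gamma=gamma)
    print(f"m={m:5d} gamma={gamma:4d}  omega_s={ws:.4f}  gamma/m={gamma/m:.3f}")
# When gamma grows like m (every shard interacts with every other shard), omega_s stays
# a constant (~1/e), so scalability fails; when gamma = o(m), omega_s vanishes with m.
```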

Next, we show that there is no protocol that maintains a robust sharded transaction ledger against an adaptive adversary in our model. We highlight that our result holds because we assume any node is corruptible by the adversary. If we assume more restrictive corruption sets, e.g., each shard has at least one honest well-connected node, sharding against an adaptive adversary may be possible if we employ other tools, such as fraud and data availability proofs [4].

Theorem 10

There is no protocol maintaining a robust sharded transaction ledger in our model against an adaptive adversary controlling \(f \ge n/m\) parties, where m is the number of shards and n is the number of parties.

Proof

(Towards contradiction). Suppose there exists a protocol \(\varPi \) that maintains a robust sharded ledger against an adaptive adversary that corrupts \(f=n/m\) parties. From the pigeonhole principle, there exists at least one shard \(x_i\) with at most n/m parties (independent of how shards are created). The adversary is adaptive, hence at any round can corrupt all parties of shard \(x_i\). In a malicious shard, the adversary can perform arbitrary operations, thus can spend the same UTXO in multiple cross-shard transactions. However, for a cross-shard transaction to be executed it needs to be accepted by the output shard, which is honest. Now, suppose \(\varPi \) allows the parties of each shard to verify the ledger of another shard. For Lemma 15 to hold, the verification process can affect at most o(m) shards. Note that even a probabilistic verification, i. e., randomly select some transactions to verify, can fail due to storage requirements and the fact that the adversary can perform arbitrarily many attacks. Therefore, for each shard, there are at least 2 different shards that do not verify the cross-shard transactions (since Lemma 15 essentially states they cannot all be verified). Thus, the adversary can simply attempt to double-spend the same UTXO across every shard and will succeed in the shards that do not verify the validity of the cross-shard transaction. Hence, consistency is not satisfied.

1.2 A.2 Bounds Under Uniform Shard Creation

In this section, we assume that the creation of shards is UTXO-dependent; transactions are assigned to shards independently and uniformly at random. This assumption is in line with the protocols proposed in the literature. In a non-randomized process of creating shards, the adversary can precompute and thus bias the process in a permissionless system. Hence, all sharding proposals employ a random process for shard creation. Furthermore, all shards validate approximately the same number of transactions; otherwise the efficiency of the protocol would depend on the shard that validates the most transactions. For this reason, we assume the UTXO space is partitioned into shards uniformly at random. Note that we consider UTXOs to be random strings.

Under this assumption, we prove a constant fraction of transactions are cross-shard on expectation. As a result, we prove no sharding protocol can maintain a robust sharded ledger when participants act as light clients on all shards involved in cross-shard transactions. Our observations hold for any transaction distribution \(D_T\) that results in a constant fraction of cross-shard transactions.

Lemma 16

The expected number of cross-shard transactions is \(\varTheta (|T|)\).

Proof

Let \(Y_i\) be the random variable that indicates whether a transaction is cross-shard; \(Y_i=1\) if \(tx_i \in T\) is cross-shard, and 0 otherwise. Since UTXOs are assigned to shards uniformly at random, \(Pr[i\in x_k]=\frac{1}{m}\), for every UTXO i of a transaction and every shard \(k \in [m]=\{1, 2,\dots , m \}\). The probability that all UTXOs in a transaction \(tx\in T\) belong to the same shard is \(\frac{1}{m^{v-1}}\) (where v is the number of UTXOs in tx). Hence, \(Pr[Y_i=1]=1-\frac{1}{m^{v-1}}\). Thus, the expected number of cross-shard transactions is \(\mathbb {E}(\sum _{\forall tx_i \in T} Y_i)= |T|\big (1-\frac{1}{m^{v-1}} \big )\). Since \(m(n)=\omega (1)\) (Lemma 14) and v is constant, the expected number of cross-shard transactions converges to |T| for n sufficiently large.
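The following small simulation (illustrative only; the shard-assignment hash is modelled by uniform sampling) checks the probability \(1-\frac{1}{m^{v-1}}\) derived in Lemma 16.

```python
# Minimal sketch: with UTXOs assigned to shards uniformly at random, a transaction
# with v UTXOs is cross-shard unless all v land in the same shard (prob. 1/m^(v-1)).
import random

def cross_shard_fraction(m, v, trials=100_000, seed=0):
    rng = random.Random(seed)
    cross = sum(len({rng.randrange(m) for _ in range(v)}) > 1 for _ in range(trials))
    return cross / trials

for m in (4, 16, 64):
    for v in (2, 3):
        emp = cross_shard_fraction(m, v)
        print(f"m={m:3d} v={v}: empirical={emp:.3f}  analytic={1 - 1/m**(v-1):.3f}")
# As m grows, the fraction of cross-shard transactions approaches 1, i.e. Theta(|T|).
```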

Lemma 17

For any protocol that maintains a robust sharded transaction ledger, it holds \(\gamma =\varTheta (m)\).

Proof

We assume each transaction has a single input and output, hence \(v=2\). This is the worst-case input for evaluating how many shards interact per transaction; if \(v\gg 2\) then each transaction would most probably involve more than two shards and thus each shard would interact with more different shards for the same set of transactions.

For \(v=2\), we can reformulate the problem as a graph problem. Suppose we have a random graph G with m nodes, each representing a shard. Now let an edge between nodes u and w represent a transaction between shards u and w. Note that in this setting we allow self-loops, which represent the intra-shard transactions. We create the graph G with the following random process: We choose an edge independently and uniformly at random from the set of all possible edges including self-loops, denoted by \(E'\). We repeat the process independently |T| times, i. e., as many times as the cardinality of the transaction set. We note that each trial is independent and the edges chosen uniformly at random due to the corresponding assumptions concerning the transaction set and the shard creation. We will now show that the average degree of the graph is \(\varTheta (m)\), which immediately implies the statement of the lemma.

Let the random variable \(Y_i\) represent the existence of edge i in the graph, i. e., \(Y_i=1\) if edge i was created at any of the T trials, 0 otherwise. The set of all possible edges in the graph is E, \(|E|=\left( {\begin{array}{c}m\\ 2\end{array}}\right) =\frac{m(m-1)}{2}\). Note that this is not the same as set \(E'\) which includes self-loops and thus \(|E'|=\left( {\begin{array}{c}m\\ 2\end{array}}\right) +m=\frac{m(m+1)}{2}\). For any vertex u of G, it holds

$$\begin{aligned} \mathbb {E}[deg(u)]=\frac{2 \mathbb {E}[\sum _{\forall i \in E} Y_i]}{m} \end{aligned}$$

where deg(u) denotes the degree of node u. We have,

$$\begin{aligned} Pr[Y_i=1]&= 1-Pr[Y_i=0] \\&= 1- Pr[Y_i=0 \text { at trial } 1]\cdot Pr[Y_i=0 \text { at trial } 2] \cdots Pr[Y_i=0 \text { at trial } |T|] \\&= 1-\Big (1-\frac{2}{m(m+1)} \Big )^{|T|} \end{aligned}$$

Thus,

$$\begin{aligned} \mathbb {E}[deg(u)]=\frac{2m(m-1)}{2}\Big [ 1-\Big (1-\frac{2}{m(m+1)} \Big )^{|T|} \Big ] \end{aligned}$$
$$\begin{aligned} =(m-1)\Big [ 1-\Big (1-\frac{2}{m(m+1)} \Big )^{|T|} \Big ] \end{aligned}$$

Therefore, when the number of transactions is large, i.e., \(|T|=\omega (m^2)\), we obtain \(\mathbb {E}[deg(u)]= \varTheta (m)\).
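As a sanity check on Lemma 17 (purely illustrative parameter choices), the sketch below evaluates the closed-form expected degree and shows it approaching \(m-1\) once \(|T|=\omega (m^2)\).

```python
# Minimal sketch: expected number of distinct shards a shard interacts with, modelled
# as the expected degree in the random graph of Lemma 17.

def expected_degree(m, T):
    p_edge = 1 - (1 - 2 / (m * (m + 1))) ** T    # Pr[a given unordered pair is ever hit]
    return (m - 1) * p_edge                       # E[deg(u)] over the m-1 possible neighbours

for m in (10, 100, 1000):
    T = 10 * m * m                                # |T| = omega(m^2) transactions
    print(f"m={m:5d} |T|={T:9d}  E[deg] = {expected_degree(m, T):8.1f}  (m - 1 = {m - 1})")
# With |T| = omega(m^2), E[deg(u)] approaches m - 1, i.e. gamma = Theta(m).
```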

Theorem 11

There is no protocol that maintains a robust sharded transaction ledger in our model under uniform space partition when parties are light nodes on the shards involved in cross-shard transactions.

Proof

Immediately follows from Lemmas 15 and 17.

1.3 A.3 Bounds Under Random Permutation of Parties to Shards

In this section, we assume parties are periodically randomly shuffled among shards, using a random permutation of their IDs. Any other shard assignment strategy yields equivalent or worse guarantees since we have no knowledge of which parties are Byzantine. Our goal is to upper bound the number of shards for a protocol that maintains a robust sharded transaction ledger in our security model. To satisfy the security properties, we demand each shard to contain at least a constant fraction of honest parties \(1-a\) \((< 1-\frac{f}{n})\), where a is the tolerance of the shards. This is due to classic lower bounds of consensus protocols [35].

The size of a shard is the number of the parties assigned to the shard. We say shards are balanced if all shards have approximately the same size. In what follows, we assume shards to be balanced (this can be done by drawing uniformly at random a balanced partition of parties). We denote by \(p=f/n\) the (constant) fraction of the Byzantine parties. A shard is a-honest if at least a fraction of \(1-a\) parties in the shard are honest.

The following lemma, proven by Raab and Steger [42] will be useful later:

Lemma 18

Let M be the random variable that counts the maximum number of balls in any bin if we throw pn balls independently and uniformly at random into m bins. Then \(Pr[M>k_{\alpha }] = o(1)\) if \(\alpha >1\) and \(Pr[M>k_{\alpha }] = 1 - o(1)\) if \(0<\alpha < 1\), where

$$\begin{aligned} k_{\alpha } = {\left\{ \begin{array}{ll} \frac{\log {m}}{\log {\frac{m\log {m}}{pn}}} \cdot \Big (1+\alpha \frac{\log ^{(2)}{\frac{m\log {m}}{pn}}}{\log {\frac{m\log {m}}{pn}}}\Big ) &{} \text {if } \frac{m}{{\text {polylog}}(m)}\le pn \ll m\log {m},\\ (d_c-1+\alpha )\log {m} &{} \text {if } pn=cm\log {m} \text { for some constant }c,\\ \frac{pn}{m} + \alpha \sqrt{2\frac{pn}{m}\log {m}} &{} \text {if } m\log {m}\ll pn\le m\,{\text {polylog}}(m), \\ \frac{pn}{m}+ \sqrt{\frac{2pn\log {m}}{m}\Big (1-\frac{\log ^{(2)}{m}}{2\alpha \log {m}}\Big )} &{} \text {if } pn \gg m(\log {m})^{3} \end{array}\right. } \end{aligned}$$
(1)

Lemma 19

Given n parties assigned uniformly at random to m shards of equal size \(s=\frac{n}{m}\), where the adversary corrupts at most \(f=pn\) parties, all shards are a-honest (p, a are constants, with p the proportion of corrupted parties and a the tolerance of the model) with probability \(1-o(1)\) if and only if the number of shards is at most the value of m satisfying \(n= cm\log (m)/p\), where c is a constant and p/a is small enough depending only on the value of c.

Proof

We start by reformulating the problem in order to show it is equivalent to the well-known Generalized Birthday Paradox.

Assume we build m shards of equal size \(s=\frac{n}{m}\) using a random permutation with uniform probability. This is equivalent to distributing the Byzantine processes to shards uniformly at random, but with the shards having a maximum size of s. In other words, we throw \(f=pn\) balls into m bins of limited capacity s. We would like to know the probability that the maximum load of any bin is greater than or equal to \(as\).

Reformulated as the Birthday Paradox: what is the probability that, in a room of \(pn\) people whose birthdays are spread uniformly at random over m days, at least \(as\) people share the same birthday? We denote this probability by \(f(pn, m, a)\).

Notice that our reformulation as the Birthday Paradox does not take into account the limited size of the possible birthdays (no more than s people can have the same birthday). Both problems are however equivalent, as we can reconstruct that probability easily using Bayes’ formula:

$$P(A|B) = \frac{P(B|A)*P(A)}{P(B)} $$

where \(A = \) “the maximum load is \(\le as\)”, \(B = \) “the maximum load is \(\le s\)”, and \(C = A|B = \) “all shards are a-honest”. \(P(B|A) = 1\) since \(a<1\), so

$$ P(C) = \frac{P(A)}{P(B)} $$

hence solving the Birthday Paradox solves our problem with very little additional calculation. Our calculation will actually be conducted using \(A' = \) “the maximum load is \(\ge as\)” and \(B' = \) “the maximum load is \(\ge s\)”:

$$ P(C) = \frac{1-P(A')}{1-P(B')} $$

Since \(\frac{1-o(1)}{1-o(1)}\ge 1-o(1)\), it is sufficient for \(P(C) = 1-o(1)\) that \(P(A')=o(1)\) and \(P(B') = o(1)\). The problem is sometimes denoted as the Cell Occupancy Problem [23].

We then use Lemma 18 (beware, in the original paper [42] n and m are reversed compared with our notation). We want \(\alpha > 1\) and set \(k_{\alpha } = \frac{an}{m}\).

When applying this, we immediately get impossible equations for the third and fourth values of \(k_{\alpha }\); hence it is not possible to have m in that range of values compared to n (i.e., \(pn \gg m\log {m}\)):

$$ \frac{an}{m} = \frac{pn}{m} + \alpha \sqrt{2\frac{pn}{m}\log {m}}$$
$$ \frac{(a-p)n}{m} = \alpha \sqrt{2\frac{pn}{m}\log {m}}$$
$$ \frac{n}{m} = \frac{\alpha \sqrt{2p}}{(a-p)} \sqrt{\frac{n}{m}\log {m}}$$
$$ \sqrt{n} = \frac{\alpha \sqrt{2p}}{(a-p)} \sqrt{m\log {m}}$$
$$ n = \frac{\alpha ^22p}{(a-p)^2}m\log {m}$$

As we can see, we also violate the hypothesis that \(pn\gg m\log {m}\), which is absurd. For the fourth equation, we can simply notice that since \(\alpha >1\), \((1-\frac{\log ^{(2)}{m}}{2\alpha \log {m}}) \le 1\); hence, reusing the calculation made for the third case, n will be even smaller compared with \(m\log {m}\), and thus the hypothesis \(pn \gg m(\log {m})^{3}\) is violated.

The equation, however, is correct under the hypothesis that \(pn = cm\log {m}\) (see the calculation below). This indicates that this is the highest value of m we can use while keeping the shards safe with overwhelming probability.

$$ \frac{an}{m} = (d_c - 1 + \alpha )\log {m} $$
$$ n = \frac{1}{a}(d_c - 1 + \alpha )m\log {m} $$

We can see already that we are indeed verifying the hypothesis \(pn = cm\log {m}\) for some constant c (the constant \(d_c\) is a scalar not dependent on either n or m). If \(k_{\alpha } = \frac{n}{m}\), then \(n = (d_c - 1 + \alpha )m\log {m}\) and the hypothesis is also verified.

We now need to make sure that \(\alpha > 1\) for both cases.

Since, by hypothesis, \(pn = cm\log {m}\), we identify that \(c = \frac{p}{a}(d_c - 1 + \alpha )\), where \(d_c \ge c\). In order to obtain \(\alpha > 1\), it is necessary that \(c > \frac{p}{a} d_c\) where \(p < a\). \(d_c\) is a function of c with \(d_c >c\), hence for a given c it is always possible to enforce \(\alpha > 1\) if p/a is small enough.

For the case \(k_{\alpha } = \frac{n}{m}\), the previous result holds trivially with \(a = 1\).

Using the previous calculations, we can exhibit the trade-off between security and scalability in a mathematical formulation in Corollary 20. A system designer may choose to adjust either parameter, p/a or c, one being computed from the chosen value of the other. Since the expression is not mathematically intuitive, we provide a plot of the increasing function \(p/a = g(c)\) in Fig. 1.

Fig. 1. \(p/a = g(c)\) as described in Corollary 20. p is the proportion of corrupted parties in the system, while a is the maximum proportion of corrupted parties allowed per shard.

Corollary 20

In a sharding protocol maintaining a robust sharded transaction ledger against an adversary, the trade-off between scalability (low value of c) and security (high value of p/a) is described by \(\frac{c}{d_c} > \frac{p}{a}\), where c is the multiplicative constant in the relation \(pn=cm\log (m)\), \(d_c\) is a function of c, and p and a are the proportions of corrupted parties in the system and allowed per shard, respectively.

Proof

According to Lemma 19, the constant \(d_c\) is a real number dependent only on c and

$$\frac{c}{d_c} > \frac{p}{a}$$

which means the value of p/a is bounded above by the value of \(c/d_c\).

As explained in [42], \(d_c\) is the solution to the equation \(1 + x(\log (c) - \log (x) + 1 ) - c = 0\) that is greater than c. Thus, we have the exact mathematical expression of the well-known security/scalability trade-off.
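For concreteness (and outside the paper's formal treatment), \(d_c\) can be computed numerically from this equation; the sketch below uses a simple bisection with natural logarithms and hypothetical values of c to evaluate the ceiling \(c/d_c\) on p/a.

```python
# Minimal sketch (hypothetical values of c): solve 1 + x(log c - log x + 1) - c = 0 for
# the root x = d_c > c, then evaluate the ceiling c/d_c on p/a (Corollary 20).
import math

def d_c(c, hi=1e6, tol=1e-9):
    f = lambda x: 1 + x * (math.log(c) - math.log(x) + 1) - c
    lo = c + tol                  # we want the root larger than c; f is positive just above c
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:            # f decreases on (c, inf), so keep the sign change bracketed
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for c in (1.5, 2, 4, 8):
    dc = d_c(c)
    print(f"c={c:4}: d_c={dc:7.3f}   p/a must stay below c/d_c = {c/dc:.3f}")
# Larger c (more parties relative to m*log m) tolerates a larger p/a, i.e. more corruption
# per shard; this is the security/scalability trade-off g(c) plotted in Fig. 1.
```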

Corollary 21

In a sharding protocol maintaining a robust sharded transaction ledger against an adversary, m is upper-bounded by \(f(n) = \frac{n}{c'\log (\frac{n}{c'\log (n)})}\) with \(c' = \frac{c}{p}\) and c a constant as described in Corollary 20.

Proof

By Lemma 19, \(cm\log (m)=pn\). Using \(m = \frac{n}{c'\log (m)}\) (a), we obtain \(m = \frac{n}{c'\log (\frac{n}{c'\log (m)})}\), and since \( n \ge m\), an upper bound is \(f(n) = \frac{n}{c'\log (\frac{n}{c'\log (n)})}\). Note that we could build a tighter but more complex upper bound by replacing m by its expression (a) instead of n as many times as desired.
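As an illustration only (with hypothetical constants c and p), the bound of Corollary 21 can be evaluated as follows; the base of the logarithm is immaterial asymptotically, and natural logarithms are used here.

```python
# Minimal sketch (illustrative constants): the upper bound on the number of shards from
# Corollary 21, m <= n / (c' * log(n / (c' * log n))) with c' = c/p.
import math

def max_shards(n, c, p):
    cp = c / p                                    # c'
    return n / (cp * math.log(n / (cp * math.log(n))))

for n in (10_000, 100_000, 1_000_000):
    print(f"n={n:9d}: m <= {max_shards(n, c=4, p=0.25):8.1f}")
# The number of shards, and hence the attainable speed-up, grows only as n / log n.
```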

Next, we prove that any sharding protocol may scale at most by an \(n/\log {n}\) factor. This bound refers to independent nodes. If, for instance, we “shard” per authority, but all authorities are represented in each shard, the bound of the theorem does not hold and the system should not be considered sharded, since every authority holds all the data.

Theorem 12

Any protocol that maintains a robust sharded transaction ledger in our model under uniformly random partition of the state and parties, can scale at most by a factor of m, where \(n=c'm\log m\) and the constant \(c'\) encompasses the trade-off between security and scalability.

Proof

In our security model, the adversary can corrupt \(f=pn\) parties, for constant p. Hence, from Corollary 20, \(m = O(\frac{n}{\log m})\). Each party stores at least T/m transactions on average, and thus the expected space factor is \(\omega _s \ge n\frac{T/m}{nT}=\frac{1}{m}\). Therefore, any sharding protocol can scale at most by a factor of \(m=O(\frac{n}{\log m})\).

Next, we show that any sharding protocol that satisfies scalability requires some process of verifiable compaction of state such as checkpoints [33], cryptographic accumulators [10], zero-knowledge proofs [9], non-interactive proofs of proofs-of-work [15, 28], proof of necessary work [27] or erasure codes [26]. Such a process allows the state of the distributed ledger (e.g., stable transactions) to be compressed significantly while users can verify the correctness of the state. Intuitively, in any sharding protocol secure against a slowly adaptive adversary parties must periodically shuffle in shards. To verify new transactions the parties must receive a verifiably correct UTXO pool for the new shard without downloading the full shard history; otherwise the communication overhead of the bootstrapping process eventually exceeds that of a non-sharded blockchain. Although existing evaluations typically ignore this aspect with respect to bandwidth, we stress its importance in the long-term operation: the bootstrap cost will eventually become the bottleneck due to the need for nodes to regularly shuffle.

Theorem 13

Any protocol that maintains a robust sharded transaction ledger in our model, under uniformly random partition of the state and parties, employs verifiable compaction of the state.

Proof

(Towards contradiction). Suppose there is a protocol that maintains a robust sharded ledger without employing any process that verifiably compacts the blockchain. To guarantee security against a slowly-adaptive adversary, the parties change shards at the end of each epoch. At the beginning of each epoch, the parties must process a new set of transactions. To check the validity of this new set of transactions, each (honest) shard member downloads and maintains the corresponding ledger. Note that even if the party only maintains the hash-chain of a ledger, the cost is equivalent to maintaining the list of transactions given that the block size is constant. We will show that the communication factor increases with time, eventually exceeding that of a non-sharded blockchain; thus scalability is not satisfied from that point on.

In each epoch transition, a party changes shards with probability \(1-1/m\), where m is the number of shards. As a result, a party changing shards in epoch k must download the shard’s ledger of size \(\dfrac{k\cdot T}{m}\). Therefore, the expected communication factor of bootstrapping during the k-th epoch transition is \(\dfrac{k\cdot T}{m} \cdot (1-\dfrac{1}{m})\). We observe that the communication overhead grows with the number of epochs k; hence it will eventually become the scaling bottleneck. For instance, for \(k> m\cdot n\), the communication factor is greater than linear in the number of parties n, and thus the protocol does not satisfy scalability.
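As a back-of-the-envelope illustration of this growth (all numbers hypothetical, and the factor measured in units of |T|), the sketch below evaluates the expected bootstrap communication factor per epoch transition.

```python
# Minimal sketch: expected bootstrapping communication factor (in units of |T|) at the
# k-th epoch transition when no verifiable state compaction is used (Theorem 13).

def bootstrap_factor(k, m):
    return (k / m) * (1 - 1 / m)

n, m = 1_000, 50                                   # illustrative system size and shard count
for k in (1, 100, 10_000, m * n, 10 * m * n):
    print(f"epoch {k:8d}: bootstrap communication factor ~ {bootstrap_factor(k, m):10.1f}"
          f"   (n = {n})")
# Once k exceeds roughly m*n the factor is super-linear in n, i.e. worse than a
# non-sharded blockchain, so scalability fails without verifiable state compaction.
```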

Theorem 13 holds even if parties are not assigned to shards uniformly at random but follow some other shuffling strategy like in [43]. As long as a significant fraction of honest parties change shards from epoch to epoch, verifiable compaction of state is necessary to restrict the bandwidth requirements during bootstrapping in order to satisfy scalability.

B Analysis

We show that Divide & Scale is secure in our model (i. e., satisfies persistence, consistency, and liveness), while its efficiency (i. e., scalability and throughput factor) depends on the chosen subprotocols. For the purpose of our analysis, we assume all employed subprotocols satisfy liveness.

Theorem 22

Divide & Scale satisfies persistence in our system model assuming at most f Byzantine nodes.

Proof

Assuming Sybil guarantees the fair distribution of identities (Sybil, property iv), and Divide2Shards maintains the distribution within the desired limits to guarantee the security bounds of Consensus (Divide2Shards, property iii), the common prefix property is satisfied in each shard, so persistence is satisfied.

Theorem 23

Divide & Scale satisfies consistency in our system model assuming at most f Byzantine nodes.

Proof

Transactions can either be intra-shard (all UTXOs within a single shard) or cross-shard. Consistency is satisfied for intra-shard transactions as long as Sybil and Divide2Shards result in a distribution that respects the security bounds of Consensus, hence the common prefix property is satisfied. Furthermore, consistency is satisfied for cross-shard transactions by the CrossShard protocol as long as it correctly provides atomicity.

Theorem 24

Divide & Scale satisfies liveness in our system model assuming at most f Byzantine nodes.

Proof

Follows from the assumption that all subprotocols satisfy liveness, as well as the CompactState protocol that ensures data availability between epochs.

Scalability. The scalability of Divide & Scale depends on the worst scaling factor, i. e., communication, space, or computation, among all the components it employs. The maximum scaling factor for DRG, Divide2Shards, Sybil, and CompactState can be amortized over the rounds of an epoch because these protocols are executed once per epoch. Thus, the size of an epoch is critical for scalability. Intuitively, this implies that if the epoch is short, hence the adversary is highly adaptive, sharding is not that beneficial, as the protocols that are executed at the epoch transition are as resource-demanding as consensus in a non-sharded system.

Throughput Factor. Similarly to scalability, the throughput factor also depends on the chosen subroutines, in particular Consensus and CrossShard. To be specific, the throughput factor depends on the shard growth and shard quality parameters, which are determined by Consensus. In addition, given a transaction input, the degree of parallelism, which is the last component of the throughput factor, is determined by the maximum number of shards possible and the way cross-shard transactions are handled. The maximum number of shards depends on Consensus and Divide2Shards, while CrossShard determines how many shards are affected by a single transaction. For instance, if the transactions are divided into shards uniformly at random, Divide & Scale can scale at most by \(n/\log n\) as stated in Corollary 20. We further note that the minimum number of affected shards for a specific transaction is the number of UTXOs that map to different shards; otherwise security cannot be guaranteed.

We demonstrate in Appendix C how to calculate the scaling factors and the throughput factor for OmniLedger and RapidChain.

C Evaluation of Existing Protocols

In this section, we evaluate existing sharding protocols in our model with respect to the desired properties defined in Sect. 2.2. A summary of our evaluation can be found in Table 1 in Sect. 5.

The analysis is conducted in the synchronous model, and thus any details regarding performance during periods of asynchrony are discarded. The same holds for other practical refinements that do not asymptotically improve performance.

1.1 C.1 Elastico

Overview. Elastico is the first distributed blockchain sharding protocol, introduced by Luu et al. [36]. The protocol lies at the intersection of traditional BFT protocols and Nakamoto consensus. The protocol is synchronous and proceeds in epochs. The setting is permissionless, and during each epoch, the participants create valid identities for the next epoch by producing proof-of-work (PoW) solutions. The adversary is slowly-adaptive (see Sect. 2) and controls at most \(25\%\) of the computational power of the system or, equivalently, \(f< \frac{n}{4}\) out of n valid identities in total.

At the beginning of each epoch, parties are partitioned into small shards (committees) of constant size c. The number of shards is \(m=2^s\), where s is a small constant such that \(n=c \cdot 2^s\). A shard member contacts its directory committee to identify the other members of the same shard. For each party, the directory committee consists of the first c identities created in the epoch in the party’s local view. Transactions are randomly partitioned into disjoint sets based on the hash of the transaction input (in the UTXO model); hence, each shard only processes a fraction of the total transactions in the system. The shard members execute a BFT protocol to validate the shard’s transactions and then send the validated transactions to the final committee. The final committee consists of all members with a fixed s-bit shard identity, and is in charge of two operations: (i) computing and broadcasting the final block, which is a digital signature on the union of all valid received transactions (see Note 6), via executing a BFT protocol, and (ii) generating and broadcasting a bounded, exponentially biased random string to be used as a public source of randomness in the next epoch (e.g., for the PoW).

Consensus: Elastico does not specify the consensus protocol but instead can employ any standard BFT protocol, like PBFT [17].

CrossShard & StatePartition: Each transaction is assigned to a shard according to the hash of the transaction’s inputs. Every party maintains the entire blockchain, thus each shard can validate the assigned transaction independently, i. e., there are no cross-shard transactions. Note that Elastico assumes that transactions have a single input and output, which is not the case in cryptocurrencies as discussed in Sect. 3. To generalize Elastico’s transaction assignment method to multiple inputs, we assume each transaction is assigned to the shard corresponding to the hash of all its inputs. Otherwise, if each input is assigned to a different shard according to its hash value, an additional protocol is required to guarantee the atomicity of transactions and hence the security (consistency) of Elastico.

Sybil: Participants create valid identities by producing PoW solutions using the randomness of the previous epoch.

Divide2Shards & CompactState: The protocol assigns each identity to one of the \(2^s\) shards at random, identified by an s-bit shard identity. At the end of each epoch, the final committee broadcasts the final block, which contains the Merkle root of every shard’s block. The final block is stored by all parties in the system. Hence, when the parties are re-assigned to new shards, they already have the hash-chain needed to confirm the shard ledger and future transactions. Essentially, an epoch in Elastico is equivalent to a block generation round.

DRG: In each epoch, the final committee (of size c) generates a set of random strings R via a commit-and-XOR protocol. First, all committee members generate an r-bit random string \(r_i\) and send the hash \(h(r_i)\) to all other committee members. Then, the committee runs an interactive consistency protocol to agree on a single set of hash values S, which they include on the final block. Later, each (honest) committee member broadcasts its random string \(r_i\) to all parties in the network. Each party chooses and XORs \(c/2+1\) random strings for which the corresponding hash exists in S. The output string is the party’s randomness for the epoch. Note that \(r>2\lambda +c-\log (c)/2\), where \(\lambda \) is a security parameter.
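The commit-and-XOR step can be sketched as follows. This is a simplified illustration only: it omits the interactive consistency protocol used to agree on S, the bounded-exponential-bias analysis, and the network layer, and all sizes are arbitrary.

```python
# Simplified sketch of Elastico's commit-and-XOR randomness generation (illustrative sizes;
# the agreement on S and the bias bounds are omitted).
import hashlib, secrets

R_BYTES = 32          # length of each random string r_i (illustrative)
C = 8                 # final committee size (illustrative)

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

# Commit phase: every committee member picks r_i and circulates h(r_i).
randoms = [secrets.token_bytes(R_BYTES) for _ in range(C)]
S = {H(r) for r in randoms}                      # agreed-upon set of commitments

# Reveal phase: members broadcast r_i; a party keeps c/2 + 1 strings whose hash is in S.
chosen = [r for r in randoms if H(r) in S][: C // 2 + 1]

# XOR the chosen strings to obtain this party's randomness for the epoch.
seed = bytes(R_BYTES)
for r in chosen:
    seed = bytes(a ^ b for a, b in zip(seed, r))
print("epoch randomness:", seed.hex())
```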

Analysis. Elastico’s threat model allows for adversaries that can drop or modify messages, and send different messages to honest parties, which is not allowed in our model. However, we show that even under a more restrictive adversarial model, Elastico fails to meet the desired sharding properties. Specifically, we prove Elastico does not satisfy scalability and consistency. From the security analysis of [36], it follows that Elastico satisfies persistence and liveness in our system model.

Theorem 25

Elastico does not satisfy consistency in our system model.

Proof

Suppose a party submits two valid transactions, one spending input x and another spending inputs x and y. Note that the second is a single transaction with two inputs. In this case, the probability that both hashes (transactions), H(xy) and H(x), land in the same shard is 1/m. Hence, the probability of a successful double-spend in a set of T transactions is almost \(1-(1/m)^T\), which converges to 1 as T grows, for any value \(m>1\). However, \(m>1\) is necessary to satisfy scalability (Lemma 14). Therefore, there will almost surely be a round in which two parties report two conflicting transactions. Since the final committee does not verify the validity of transactions but only checks that the appropriate signatures are present, consistency is not satisfied.
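For intuition (with a hypothetical transaction encoding, not Elastico's exact format), the sketch below estimates how often the two conflicting transactions from the proof are assigned to different shards and therefore go undetected.

```python
# Minimal sketch: the conflicting transactions of Theorem 25 land in different shards
# with probability ~1 - 1/m, so neither shard sees the double-spend.
import hashlib, secrets

def shard_of(inputs: bytes, m: int) -> int:
    return int.from_bytes(hashlib.sha256(inputs).digest(), "big") % m

m, trials, hidden = 16, 10_000, 0
for _ in range(trials):
    x, y = secrets.token_bytes(32), secrets.token_bytes(32)
    if shard_of(x, m) != shard_of(x + y, m):      # tx1 spends x; tx2 spends x and y
        hidden += 1
print(f"double-spends assigned to different shards: {hidden/trials:.3f}"
      f"  (expected ~ {1 - 1/m:.3f})")
```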

Lemma 26

The communication and space factors of Elastico are \(\omega _m=\varTheta (n)\) and \(\omega _s=\varTheta (1)\).

Proof

At the end of each epoch, which corresponds to the generation of one block per shard, the final committee broadcasts the final block to the entire network. All parties download and store the final block; hence, all parties maintain the entire input set of transactions. Since the block size is considered constant, downloading and storing the final block, which consists of the hash-chains of all shards, is equivalent to downloading and storing all the shards’ ledgers. It follows that the space factor is \(\omega _s=\varTheta (1)\) as all parties store a constantly-compressed version of the input T, regardless of the nature of the input set T. Similarly, it follows that the communication factor is \(\omega _m=\varTheta (n)\) as the broadcast of the final block takes place regularly at the generation of one block per shard, i. e., Elastico’s epoch.   \(\square \)

Theorem 27

Elastico does not satisfy scalability in our system model.

Proof

Immediately follows from Definition 4 and Lemma 26.   \(\square \)

1.2 C.2 Monoxide

Overview. Monoxide [53] is an asynchronous proof-of-work protocol, where the adversary controls at most \(50\%\) of the computational power of the system. The protocol uniformly partitions the space of user addresses into shards (zones) according to the first k bits. Every party is permanently assigned to a shard uniformly at random. Each shard employs the GHOST [47] consensus protocol.

Participants are either full-nodes that verify and maintain the transaction ledgers, or miners investing computational power to solve PoW puzzles for profit in addition to being full-nodes. Monoxide introduces a new mining algorithm, called Chu-ko-nu, that enables miners to mine in parallel for all shards. The Chu-ko-nu algorithm aims to distribute the hashing power to protect individual shards from an adversarial takeover. Successful miners include transactions in blocks. A block in Monoxide is divided into two parts: the chaining block that includes all metadata (Merkle root, nonce for PoW, etc.) creating the hash-chain, and the transaction-block that includes the list of transactions. All parties maintain the hash-chain of every shard in the system.

Furthermore, all parties maintain a distributed hash table for peer discovery and identifying parties in a specific shard. This way the parties of the same shard can identify each other and cross-shard transactions are sent directly to the destination shard. Cross-shard transactions are validated in the shard of the payer and verified from the shard of the payee via a relay transaction and the hash-chain of the payer’s shard.

Consensus: The consensus protocol of each shard is GHOST [47]. GHOST is a tree-based consensus protocol similar to Nakamoto consensus [40], but its fork-choice rule selects the heaviest subtree instead of the longest chain.

StatePartition: Monoxide is account-based; hence, all transactions have a single input and a single output.

CrossShard: An input shard is a shard that corresponds to the address of a sender of a transaction (payer), while an output shard is one that corresponds to the address of a receiver of a transaction (payee). Each cross-shard transaction is processed in the input shard, where an additional relay transaction is created and included in a block. The relay transaction consists of all metadata needed to verify the validity of the original transaction by only maintaining the hash-chain of a shard (i. e., for light nodes). The miner of the output shard verifies that the relay transaction is stable and then includes it in a block in the output shard. Note that in case of forks in the input shard, Monoxide invalidates the relay transactions and rewrites the affected transaction ledger to maintain consistency.

Sybil: In a typical PoW election scheme, the adversary can create many identities and target its computational power at specific shards to gain control over more than half of a shard’s participants. In such a case, the security of the protocol fails (neither persistence nor consistency holds). To address this issue, Monoxide introduces a new mining algorithm, Chu-ko-nu, that allows parallel mining on all shards. Specifically, a miner can batch valid transactions from all shards and use the root of the Merkle tree of the list of chaining headers in the batch as input to the hash, along with the nonce (and some configuration data). Thus, when a miner successfully computes a hash lower than the target, the miner adds a block to every shard.
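The idea behind Chu-ko-nu can be sketched as follows; this is an illustrative simplification (hypothetical header format, a toy Merkle tree, and an easy target), not Monoxide's actual mining code.

```python
# Simplified sketch of Chu-ko-nu-style parallel mining: one PoW attempt over the Merkle
# root of all shards' chaining headers extends every shard on success.
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    layer = [H(x) for x in leaves]
    while len(layer) > 1:
        if len(layer) % 2:
            layer.append(layer[-1])               # duplicate last node on odd layers
        layer = [H(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]

def chu_ko_nu_mine(chaining_headers, target: int, max_nonce=2_000_000):
    root = merkle_root(chaining_headers)          # binds the attempt to all shards at once
    for nonce in range(max_nonce):
        digest = H(root + nonce.to_bytes(8, "big"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest                  # one solution -> a new block in every shard
    return None

m = 8
headers = [f"shard-{i}-tip".encode() for i in range(m)]
result = chu_ko_nu_mine(headers, target=1 << 242)
if result:
    nonce, digest = result
    print(f"nonce {nonce} extends all {m} shards; PoW digest {digest.hex()[:16]}...")
```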

Divide2Shards: Parties are permanently assigned to shards uniformly at random according to the first k bits of their address.

DRG: The protocol uses deterministic randomness (e.g. hash function) and does not require any random source.

CompactState: No compaction of state is used in Monoxide.

Analysis. We prove that Monoxide satisfies persistence, liveness, and consistency, but does not satisfy scalability. The same result is also immediately derived from our impossibility result stated in Theorem 11, as Monoxide requires each party to verify cross-shard transactions by acting as a light node for all shards, which demonstrates the power of our framework and the usability of our results.

Theorem 28

Monoxide satisfies persistence and liveness in our system model for \(f<n/2\).

Proof

From the analysis of Monoxide, it holds that if all honest miners follow the Chu-ko-nu mining algorithm, then honest majority within each shard holds with high probability for any adversary with \(f<n/2\) (Sect. 5.3 [53]).

Assuming honest majority within shards, persistence depends on two factors: the probability a stable transaction becomes invalid in a shard’s ledger, and the probability a cross-shard transaction is reverted after being confirmed. Both these factors solely depend on the common prefix property of the shards’ consensus mechanism. Monoxide employs GHOST as the consensus mechanism of each shard, hence the common prefix property is satisfied if we assume that invalidating the relay transaction does not affect other shards [29]. Suppose common prefix is satisfied with probability \(1-p\) (which is overwhelming on the “depth” security parameter k). Then, the probability none of the outputs of a transaction are invalidated is \((1-p)^{(v-1)}\) (worst case where \(v-1\) outputs – relay transactions – link to one input). Thus, a transaction is valid in a shard’s ledger after k blocks with probability \((1-p)^v\), which is overwhelming in k since v is considered constant. Therefore, persistence is satisfied.

Similarly, liveness is satisfied within each shard. Furthermore, this implies liveness is satisfied for cross-shard transactions. In particular, both the initiative and relay transactions will be eventually included in the shards’ transaction ledgers, as long as chain quality and chain growth are guaranteed within each shard [29].   \(\square \)

Theorem 29

Monoxide satisfies consistency in our system model for \(f<n/2\).

Proof

The common prefix property is satisfied in GHOST [30] with high probability. Thus, intra-shard transactions satisfy consistency with high probability (on the “depth” security parameter). Furthermore, if a cross-shard transaction output is invalidated after its confirmation, Monoxide allows rewriting the affected transaction ledgers. Hence, consistency is restored in case of a cross-shard transaction failure. Thus, overall, consistency is satisfied in Monoxide.   \(\square \)

Note that allowing to rewrite the transaction ledgers in case a relay transaction is invalidated strengthens the consistency property but weakens the persistence and liveness properties.

Intuitively, to satisfy persistence in a sharded PoW system, the adversarial power needs to be distributed across shards. To that end, Monoxide employs a new mining algorithm, Chu-ko-nu, that incentivizes honest parties to mine in parallel on all shards. However, this implies that a miner needs to verify transactions on all shards and maintain a transaction ledger for all shards. Hence, the computation and space factors are proportional to the number of (honest) participants and the protocol does not satisfy scalability.

Theorem 30

Monoxide does not satisfy scalability in our system model for \(f<n/2\).

Proof

Let m denote the number of shards (zones), \(m_p\) the fraction of mining power running the Chu-ko-nu mining algorithm, and \(m_d\) the rest of the mining power (\(m_p+m_d=1\)). Additionally, let \(m_s\) denote the mining power of one shard. The Chu-ko-nu algorithm forces the parties to verify transactions that belong to all shards, hence the parties store all sharded ledgers. To satisfy scalability, the space factor of Monoxide can be at most o(1). Similarly, it follows that the verification overhead, expressed through the computational factor, must be bounded by o(n). Thus, at most o(n) parties can run the Chu-ko-nu mining algorithm, hence \(n m_p=o(n)\). We note that the adversary will not participate in the Chu-ko-nu mining algorithm, as distributing the hashing power is to the adversary’s disadvantage.

To satisfy persistence, every shard running the GHOST protocol [47] must satisfy the common prefix property. Thus, the adversary cannot control more than \(m_a < m_s/2\) hash power, where \(m_s=\frac{m_d}{m}+m_p\). Consequently, since \(m_d+m_p=1\), we have \(m_a < \frac{m_s}{2(m_d+m_p)} = \frac{1}{2}- \frac{m_d(m-1)}{2m(m_d+m_p)}\). For n sufficiently large, \(m_p\) converges to 0; hence \(m_a < \frac{1}{2}- \frac{(m-1)}{2m} = \frac{1}{2m}\). From Lemma 14, \(m=\omega (1)\), thus the bound on the adversarial power \(m_a\) vanishes for sufficiently large n; that is, Monoxide cannot tolerate any constant fraction of adversarial mining power while keeping its space and computational factors sublinear. We conclude that Monoxide does not satisfy scalability in our model. Moreover, we identify in Monoxide a clear trade-off between security and scaling storage and verification.    \(\square \)
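For illustration, the following sketch (not part of the original analysis) evaluates the per-shard bound \(m_a < m_s/2\) numerically for a few shard counts m and Chu-ko-nu participation rates \(m_p\); the function name and sample values are ours and purely illustrative.

```python
def monoxide_adversary_bound(m: int, m_p: float) -> float:
    """Upper bound on the tolerable adversarial hash power m_a in Monoxide.

    m   : number of shards (zones)
    m_p : fraction of hash power mining on all shards (Chu-ko-nu)
    m_d : remaining hash power, each unit mining on a single shard
    Per-shard honest power is m_s = m_d/m + m_p; common prefix within a
    shard requires m_a < m_s / 2.
    """
    m_d = 1.0 - m_p
    m_s = m_d / m + m_p
    return m_s / 2

# As Chu-ko-nu participation m_p must vanish for the space and computational
# factors to stay sublinear, the tolerable adversary shrinks towards 1/(2m).
for m in (4, 16, 64):
    for m_p in (0.5, 0.1, 0.0):
        print(f"m={m:3d}  m_p={m_p:.1f}  m_a < {monoxide_adversary_bound(m, m_p):.4f}")
```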

1.3 C.3 OmniLedger

Overview. OmniLedger [33] proceeds in epochs, assumes a partially synchronous model within each epoch (to be responsive), synchronous communication channels between honest parties (with a large maximum delay), and a slowly-adaptive computationally-bounded adversary that can corrupt up to \(f<n/4\) parties.

The protocol bootstraps using techniques from ByzCoin [31]. The core idea is that there is a global identity blockchain that is extended once per epoch with Sybil resistant proofs (proof-of-work, proof-of-stake, or proof-of-personhood [12]) coupled with public keys. At the beginning of each epoch a sliding window mechanism is employed to define the eligible validators as the ones with identities in the last W blocks, where W depends on the adaptivity of the adversary. For our definition of slowly adaptive, we set \(W=1\). The UTXO space is partitioned uniformly at random into m shards, each shard maintaining its own ledger.

At the beginning of each epoch, a new common random value is created via a distributed randomness generation (DRG) protocol. The DRG protocol employs verifiable random functions (VRF) to elect a leader who runs RandHound [51] to create the random value. The random value is used as a challenge for the next epoch’s identity registration and as a seed for assigning the identities of the current epoch to shards.

Once the participants for this epoch are assigned to shards and bootstrap their internal states, they start validating transactions and updating the shards’ transaction ledgers by operating ByzCoinX, a modification of ByzCoin [31]. When a transaction is cross-shard, Atomix, a protocol that ensures the atomic execution of transactions across shards, is employed. Atomix is a client-driven atomic commit protocol secure against Byzantine adversaries.

Consensus: OmniLedger suggests the use of a strongly consistent consensus in order to support Atomix. This modular approach means that any consensus protocol [17, 25, 31, 32, 44] works with OmniLedger as long as the deployment setting of OmniLedger respects the limitations of the consensus protocol. In its experimental deployment, OmniLedger uses a variant of ByzCoin [31] called ByzCoinX [32] in order to maintain the scalability of ByzCoin and be robust as well. We omit the details of ByzCoinX as it is not relevant to our analysis.

StatePartition: The UTXO space is partitioned uniformly at random into m shards.

CrossShard (Atomix): Atomix is a client-based adaptation of the two-phase atomic commit protocol, running under the assumption that the underlying shards are correct and never crash. This assumption is satisfied because of the random assignment of parties to shards, as well as the Byzantine fault-tolerant consensus of each shard.

In particular, Atomix works in two steps: First, the client that wants the transaction to go through requests a proof-of-acceptance or proof-of-rejection from the shards managing the inputs, which log the transaction in their internal blockchain. Afterwards, the client either collects proofs-of-acceptance from all the shards or at least one proof-of-rejection. In the first case, the client communicates the proofs to the output shards, which verify the proofs and finish the transaction by generating the necessary UTXOs. In the second case, the client communicates the proofs to the input shards, which revert their state and abort the transaction. Atomix is susceptible to a subtle replay attack, hence we analyze OmniLedger with the proposed fix [48].
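For concreteness, the following is a minimal sketch of the client-driven flow of Atomix, assuming each shard exposes a single `process_input` call that returns a proof-of-acceptance or proof-of-rejection; intra-shard consensus, signatures, and the replay-attack fix of [48] are abstracted away, and all names are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Shard:
    """Toy shard: tracks the unspent inputs it manages (no real consensus)."""
    sid: int
    utxos: set = field(default_factory=set)

    def process_input(self, tx_id: str, utxo: str):
        """Phase 1: tentatively spend the input and return a proof-of-acceptance,
        or a proof-of-rejection if the input is unknown or already spent."""
        if utxo in self.utxos:
            self.utxos.remove(utxo)
            return ("accept", self.sid, tx_id, utxo)
        return ("reject", self.sid, tx_id, utxo)

    def unlock(self, utxo: str):
        """Phase 2 (abort path): revert the tentative spend."""
        self.utxos.add(utxo)

    def create_outputs(self, outputs):
        """Phase 2 (commit path): materialise the new UTXOs."""
        self.utxos.update(outputs)

def atomix(client_tx, output_shard):
    """Client-driven two-phase commit across shards."""
    tx_id, inputs, outputs = client_tx          # inputs: list of (shard, utxo)
    proofs = [shard.process_input(tx_id, utxo) for shard, utxo in inputs]
    if all(p[0] == "accept" for p in proofs):
        output_shard.create_outputs(outputs)    # commit on the output shard
        return "committed"
    for (verdict, _, _, utxo), (shard, _) in zip(proofs, inputs):
        if verdict == "accept":
            shard.unlock(utxo)                  # roll back tentatively spent inputs
    return "aborted"

s0, s1, s2 = Shard(0, {"a"}), Shard(1, {"b"}), Shard(2)
print(atomix(("tx1", [(s0, "a"), (s1, "b")], {"c"}), s2))  # committed
print(atomix(("tx2", [(s0, "a")], {"d"}), s2))             # aborted: "a" already spent
```

The client drives both phases, which is why, as discussed in the liveness analysis below, progress only requires that some party eventually relays the collected proofs.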

Sybil: A global identity blockchain with Sybil resistant proofs coupled with public keys is extended once per epoch.

Divide2Shards: Once the parties generate the epoch randomness, the parties can independently compute the shard they are assigned to for this epoch by permuting (mod n) the list of validators (available in the identity chain).
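As a minimal sketch of this step (under the assumption that the permutation is realized by sorting the validator list under a seeded hash; OmniLedger's exact permutation may differ), every party can compute the assignment locally:

```python
import hashlib

def assign_to_shards(validators, epoch_randomness: bytes, m: int):
    """Deterministically permute the validator list with the epoch randomness
    and cut it into m (nearly) equal shards; every honest party obtains the
    same assignment without further coordination."""
    def key(v: str) -> bytes:
        return hashlib.sha256(epoch_randomness + v.encode()).digest()

    permuted = sorted(validators, key=key)
    return {v: i % m for i, v in enumerate(permuted)}

validators = [f"pk_{i}" for i in range(12)]
print(assign_to_shards(validators, b"epoch-7-randomness", m=3))
```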

DRG: The DRG protocol consists of two steps to produce unbiasable randomness. In the first step, all parties evaluate a VRF using their private key and the randomness of the previous round to generate a “lottery ticket”. Then the parties broadcast their ticket and wait for \(\varDelta \) to ensure they receive the ticket with the lowest value, whose generator is elected as the leader of RandHound.

The second step, in which the elected leader runs RandHound, is a partially-synchronous randomness generation protocol, meaning that safety is not violated even in the presence of asynchrony. If the leader is honest, then eventually the parties will output an unbiasable random value, whereas if the leader is dishonest there are no liveness guarantees. To recover from this type of fault, the parties can view-change the leader and go back to the first step in order to elect a new leader.

This composition of randomness generation protocols (leader election and multiparty generation) guarantees that all parties agree on the final randomness (due to the view-change) and the protocol remains safe in asynchrony. Furthermore, if the assumed synchrony bound (which can be increasing like PBFT [17]) is correct, an honest leader will be elected in a constant number of rounds.

Note, however, that the DRG protocol is modular, thus any other scalable distributed randomness generation protocol with similar guarantees, such as Hydrand [45] or Scrape [16], can be used.
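The first (lottery) step admits a very small sketch; a keyed hash stands in for the VRF, and RandHound itself, run by the winner, is out of scope. All names are illustrative.

```python
import hashlib
import hmac

def ticket(secret_key: bytes, prev_randomness: bytes) -> bytes:
    """Stand-in for the VRF evaluation: deterministic and unpredictable
    without the secret key (a real VRF is additionally publicly verifiable)."""
    return hmac.new(secret_key, prev_randomness, hashlib.sha256).digest()

def elect_leader(parties: dict, prev_randomness: bytes) -> str:
    """Each party broadcasts its ticket; after waiting Delta, everyone elects
    the party with the lowest ticket as the RandHound leader."""
    tickets = {pid: ticket(sk, prev_randomness) for pid, sk in parties.items()}
    return min(tickets, key=tickets.get)

parties = {f"p{i}": bytes([i]) * 32 for i in range(8)}
print(elect_leader(parties, b"randomness-of-epoch-6"))
```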

CompactState: A key component that enables OmniLedger to scale is the epoch transition. At the end of every epoch, the parties run consensus on the state changes and append the new state (e.g., the UTXO pool) in a state-block that points directly to the previous epoch’s state-block. This is a classic technique during reconfiguration events of state machine replication algorithms, called checkpointing [17]. New validators do not replay the shard’s actual ledger; instead, they look only at the checkpoints, which helps them bootstrap faster.

In order to guarantee the continuous operation of the system, after the parties finish the state commitment process, the shards are reconfigured in small batches (at most 1/3 of the parties in each shard at a time). If there are any blocks committed after the state-block, the validators replay the state-transitions directly.
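The checkpointing idea can be pictured with the following sketch, assuming a state-block simply holds the active UTXO pool and a pointer to the previous state-block; this is a simplified model of the data layout, not OmniLedger's exact format.

```python
import hashlib
import json

def digest(obj) -> str:
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

def make_state_block(prev_state_block_hash: str, utxo_set: set) -> dict:
    """End-of-epoch checkpoint: the active UTXO pool plus a pointer to the
    previous epoch's state-block (not to the full ledger)."""
    return {"prev": prev_state_block_hash, "utxos": sorted(utxo_set)}

def bootstrap(state_block: dict, blocks_after_checkpoint: list) -> set:
    """A reassigned party restores the shard state from the checkpoint and
    replays only the blocks committed after it."""
    utxos = set(state_block["utxos"])
    for block in blocks_after_checkpoint:
        utxos -= set(block["spent"])
        utxos |= set(block["created"])
    return utxos

checkpoint = make_state_block("0" * 64, {"a", "b", "c"})
later_blocks = [{"spent": ["a"], "created": ["d", "e"]}]
print(bootstrap(checkpoint, later_blocks))   # {'b', 'c', 'd', 'e'}
print(digest(checkpoint))                    # pointer stored in the next state-block
```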

Analysis. In this section, we prove OmniLedger satisfies persistence, consistency, and scalability (in expectation) but fails to satisfy liveness. Nevertheless, we estimate the efficiency of OmniLedger by providing an upper bound on its throughput factor.

Lemma 31

At the beginning of each epoch, OmniLedger provides an unbiased, unpredictable random value, common to all parties (within t rounds, with overwhelming probability in t).

Proof

If the elected leader that orchestrates the distributed randomness generation protocol (RandHound or equivalent) is honest, the statement holds. On the other hand, if the leader is Byzantine, the leader cannot affect the security of the protocol, meaning the leader cannot bias the random value. However, a Byzantine leader can delay the process by being unresponsive. We show that there will be an honest leader, hence the protocol will output a random value, with overwhelming probability in the number of rounds t.

The adversary cannot pre-mine PoW puzzles, because the randomness of each epoch is used in the PoW calculation of the next epoch. Hence, the expected number of identities the adversary will control (number of Byzantine parties) in the next epoch is \(f<n/4\). Consequently, the adversary will have the smallest ticket – the output of the VRF – and thus will be the leader that orchestrates the distributed randomness generation protocol (RandHound) with probability at most 1/2. Then, the probability that there will be an honest leader within t rounds is at least \(1-\frac{1}{2^t}\), which is overwhelming in t.

The unpredictability is inherited by the properties of the employed distributed randomness generation protocol.    \(\square \)

Lemma 32

The distributed randomness generation protocol has \(O(\frac{n\log ^2 n}{R})\) amortized communication complexity, where R is the number of rounds in an epoch.

Proof

The DRG protocol inherits the communication complexity of RandHound, which is \(O(c^2n)\) [45]. In [51], the authors claim that c is constant. However, the protocol requires a constant fraction of honest parties (e.g., n/3) in each of the n/c partitions of size c against an adversary that can corrupt a constant fraction of the total number of parties (e.g., n/4). Hence, from Lemma 19, we have \(c=\varOmega (\log n)\), which leads to communication complexity \(O(n\log ^2n)\) for each epoch. Assuming each epoch consists of R rounds, the amortized per-round communication complexity is \(O(\frac{n\log ^2 n}{R})\).    \(\square \)

Corollary 33

In each epoch, the expected size of each shard is n/m.

Proof

Due to Lemma 31, the n parties are assigned independently and uniformly at random to m shards. Hence, the expected number of parties in a shard is n/m.

   \(\square \)

Lemma 34

In each epoch, all shards are \(\frac{1}{3}\)-honest for \(m\le f(n)\) with f(n) as described in Corollary 21.

Proof

Due to Lemma 31, the n parties are assigned independently and uniformly at random to m shards. Since \(a=1/3 > p=1/4\), with both a and p constant, the statement holds from Lemma 19 and Corollary 21.    \(\square \)

Note that the bound is theoretical and holds for a large number of parties, since the probability tends to 1 as the number of parties grows. For practical bounds, we refer to OmniLedger’s analysis [33].

Theorem 35

OmniLedger satisfies persistence in our system model for \(f<n/4\).

Proof

From Lemma 34, each shard has an honest supermajority of more than \(\frac{2}{3}\frac{n}{m}\) participants. Hence, persistence holds by the common prefix property of the consensus protocol of each shard. Specifically, for ByzCoinX, persistence holds for depth parameter \(k=1\) because ByzCoinX guarantees finality.    \(\square \)

Theorem 36

OmniLedger does not satisfy liveness in our system model for \(f<n/4\).

Proof

To estimate the liveness of the protocol, we need to examine all the subprotocols: (i) Consensus, (ii) CrossShard or Atomix, (iii) DRG, (iv) CompactState, and (v) Divide2Shards.

Consensus: From Lemma 34, each shard has an honest supermajority of more than \(\frac{2}{3}\frac{n}{m}\) participants. Hence, in this stage liveness holds by the chain growth and chain quality properties of the underlying blockchain protocol (an elaborate proof can be found in [24]). The same holds for CompactState, as it is executed similarly to Consensus.

CrossShard: Atomix guarantees liveness since the protocol’s progress depends only on the consensus of each shard involved in the cross-shard transaction. Note that liveness does not depend on the client’s behavior: if the required information or some part of the transaction is withheld from the parties of the protocol for multiple rounds, the liveness property does not guarantee the inclusion of the transaction in the ledger. Furthermore, if some other party wants to continue the process, it can collect all necessary information from the ledgers of the shards.

DRG: During the epoch transition, the DRG protocol provides a common random value with overwhelming probability within t rounds (Lemma 31). Hence, liveness is satisfied in this subprotocol as well.

Divide2Shards: Liveness is not satisfied in this protocol. The reason is that a slowly-adaptive adversary can select whom to corrupt during the epoch transition, and thus can corrupt an entire shard of the previous epoch. Since the compact state has not yet been disseminated in the network, the adversary can simply delete the shard’s state. Thereafter, the data unavailability prevents the progress of the system.    \(\square \)

Theorem 37

OmniLedger satisfies consistency in our system model for \(f<n/4\).

Proof

Each shard is \(\frac{1}{3}\)-honest (Lemma 34). Hence, consistency holds within each shard, and the adversary cannot successfully double-spend. Nevertheless, we need to guarantee consistency even when transactions are cross-shard. OmniLedger employs Atomix, a protocol that guarantees cross-shard transactions are atomic. Thus, the adversary cannot validate two conflicting transactions across different shards.

Moreover, the adversary cannot revert the chain of a shard and double-spend an input of a cross-shard transaction after the transaction is accepted in all relevant shards, because persistence holds (Theorem 35). Suppose persistence holds with probability p. Then, the probability that the adversary breaks consistency in a cross-shard transaction is the probability of successfully double-spending in one of the shards relevant to the transaction, i.e., \(1-p^v\), where v is the average size of transactions. Since v is constant, consistency holds with high probability, given that persistence holds with high probability.    \(\square \)

To prove OmniLedger satisfies scalability (in expectation), we need to evaluate the scaling factors of the following subprotocols of the system: (i) Consensus, (ii) CrossShard, (iii) DRG, and (iv) Divide2Shards. Note that CompactState is merely an execution of Consensus.

Lemma 38

The scaling factors of Consensus are \(\omega _m=O({n}/{m})\), \(\omega _s=O({1}/{m})\), and \(\omega _c=O({n}/{m})\).

Proof

From Corollary 33, the expected number of parties in a shard is n/m. ByzCoin has worst-case communication complexity quadratic in the number of parties, hence the communication factor of the protocol is O(n/m). The verification complexity collapses to the communication complexity. The space factor is O(1/m), as each party maintains the ledger of the assigned shard for the epoch.

   \(\square \)

Lemma 39

The communication factor of Atomix (CrossShard) is \(\omega _m=O(v\frac{n}{m})\), where v is the average size of transactions.

Proof

In a cross-shard transaction, Atomix allows the participants of the output shards to verify the validity of the transaction’s inputs without maintaining any information on the input shards’ ledgers. This holds due to persistence (see Theorem 35).

Furthermore, the verification process requires each input shard to verify the validity of the transaction’s inputs and produce a proof-of-acceptance or proof-of-rejection. This corresponds to one query to the verification oracle for each input. In addition, each party of an output shard must verify that all proofs-of-acceptance are present and that no shard rejected an input of the cross-shard transaction. The proof-of-acceptance (or rejection) consists of the signature of the shard, which is linear in the number of parties in the shard. The relevant parties have to receive all the information related to the transaction from the client (or leader), hence the communication factor is \(O(v\frac{n}{m})\).

So far, we considered the communication complexity of Atomix itself. However, each input must also be verified within the corresponding input shard. From Lemma 38, the communication factor at this step is \(O(v\frac{n}{m})\).    \(\square \)

Lemma 40

The communication factor of Divide2Shards is \(\omega _m=O(\frac{n}{mR})\), while the space factor is \(\omega _s=O(1/R)\), where R is the size of an epoch in rounds.

Proof

During the epoch transition each party is assigned to a shard uniformly at random and thus most probably needs to bootstrap to a new shard, meaning the party must store the new shard’s ledger. At this point, within each shard OmniLedger introduces checkpoints, the state blocks that summarize the state of the ledger (CompactState). Therefore, when a party syncs with a shard’s ledger, it does not download and store the entire ledger but only the active UTXO pool corresponding to the previous epoch’s state block.

For security reasons, each party that is reassigned to a new shard must receive the state block of the new shard from O(n/m) parties. Thus, the communication complexity of the protocol is \(O(\frac{n}{mR})\) amortized per round, where R is the number of rounds in an epoch.

The space complexity is constant but amortized over the epoch length since the state block has a constant size and is broadcast once per epoch, \(\omega _s=O(1/R)\). There is no verification process at this stage.    \(\square \)
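For the argument above, a joining party should not trust a single peer for the state block; a minimal sketch of the cross-check, with the threshold (e.g., a majority of the O(n/m) queried peers) left as a parameter:

```python
from collections import Counter

def accept_state_block(reported_blocks: list, threshold: int):
    """Accept the state block reported by at least `threshold` of the queried
    shard members; return None if no block reaches the threshold."""
    block, votes = Counter(reported_blocks).most_common(1)[0]
    return block if votes >= threshold else None

reports = ["h_good"] * 5 + ["h_forged"] * 2            # 5 honest, 2 Byzantine replies
print(accept_state_block(reports, threshold=4))         # 'h_good'
print(accept_state_block(["h1", "h2"], threshold=2))    # None: no agreement
```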

Theorem 41

OmniLedger satisfies scalability in our system model for \(f<n/4\) with communication and computational factor O(n/m) and space factor O(1/m), where \(n=O(m \log m)\).

Proof

To evaluate the scalability of OmniLedger, we need to estimate the dominating scaling factors of all the subprotocols of the system: (i) Consensus, (ii) CrossShard, (iii) DRG, and (iv) Divide2Shards.

The scaling factors of Consensus are \(\omega _m=O({n}/{m})\), \(\omega _s=O({1}/{m})\), and \(\omega _c=O({n}/{m})\) (Lemma 38), while Atomix (CrossShard) has expected communication factor \(O(v\frac{n}{m})\) (Lemma 39) where the average size of transaction v is constant (see Sect. 3).

The epoch transition consists of the DRG, CompactState, and Divide2Shards protocols. We assume a large enough epoch in rounds, \(R=\varOmega (n \log n)\), in order to amortize the communication-heavy protocols that are executed only once per epoch. CompactState has the same overhead as Consensus, hence it is not critical. For \(R=\varOmega (n \log n)\), DRG has an expected amortized communication factor \(O(\log n)\) (Lemma 32), while Divide2Shards has an expected amortized communication factor of \(\omega _m=O(\frac{1}{m \log n})\) and an amortized space factor of \(\omega _s=O(1/R)= O(\frac{1}{n \log n})\) (Lemma 40).

Overall, considering the worst of the aforementioned scaling factors for OmniLedger, we have expected communication and computational factors O(n/m) and space factor O(1/m), where \(n=O(m \log m)\) (see Lemma 14 and Lemma 34).    \(\square \)

Theorem 42

In OmniLedger, the throughput factor is \(\sigma =\mu \cdot \tau \cdot \dfrac{m}{v} < \frac{\mu \cdot \tau \cdot f(n)}{v} \) where \(f(n) = \frac{n}{c'\log (\frac{n}{c'\log (n)})}\) with \(c' = \frac{c}{p}\) and c a constant as described in Corollary 20.

Proof

In Atomix, at most v shards are affected per transaction, thus \(m'<m/v\) (see Footnote 7). From Lemma 19 and Corollary 21, \(m \le f(n)\). Therefore, \(\sigma < \frac{\mu \cdot \tau \cdot f(n)}{v}\).    \(\square \)

The parameter v depends on the input transaction set. The parameters \(\mu , \tau , a, p \) depend on the choice of the consensus protocol. Specifically, \(\mu \) represents the ratio of honest blocks in the chain of a shard, while \(\tau \) depends on the latency of the consensus protocol, i. e., the ratio between the propagation time and the block generation time. Lastly, a expresses the resilience of the consensus protocol (e.g., 1/3 for PBFT), while p is the fraction of corrupted parties in the system (\(f=pn\)).

In OmniLedger, the consensus protocol is modular, so we keep these parameters symbolic to allow a fairer comparison with other protocols.

1.4 C.4 RapidChain

Overview. RapidChain [55] is a synchronous protocol and proceeds in epochs. The adversary is slowly-adaptive, computationally-bounded and corrupts less than 1/3 of the participants (\(f<n/3\)).

The protocol bootstraps via a committee election protocol that selects \(O(\sqrt{n})\) parties – the root group. The root group generates and distributes a sequence of random bits used to establish the reference committee. The reference committee consists of \(O(\log n)\) parties, is re-elected at the end of each epoch, and is responsible for: (i) generating the randomness of the next epoch, (ii) validating the identities of participants for the next epoch from the PoW puzzle, and (iii) reconfiguring the shards from one epoch to the next (to protect against single shard takeover attacks).

The parties are divided into shards of size \(O(\log n)\) (committees). Each shard handles a fraction of the transactions, assigned based on the prefix of the transaction ID. Transactions are sent by external users to an arbitrary number of active (for this epoch) parties. The parties then use an inter-shard routing scheme (based on Kademlia [38]) to send the transactions to the input and output shards, i. e., the shards handling the inputs and outputs of a transaction, resp.

To process cross-shard transactions, the leader of the output shard creates an additional transaction for every different input shard. Then the leader sends (via the inter-shard routing scheme) these transactions to the corresponding input shards for validation. To validate transactions (i. e., a block), each shard runs a variant of the synchronous consensus of Ren et al. [44] and thus tolerates 1/2 Byzantine parties.

At the end of each epoch, the shards are reconfigured according to the participants registered in the new reference block. Specifically, RapidChain uses a bounded version of Cuckoo rule [46]; the reconfiguration protocol adds a new party to a shard uniformly at random, and also moves a constant number of parties from each shard and assigns them to other shards uniformly at random.

Consensus: In each round, each shard randomly picks a leader. The leader creates a block, gossips the block header H (containing the round and the Merkle root) to the members of the shard, and initiates the consensus protocol on H. The consensus protocol consists of four rounds: (1) the leader gossips (H, propose), (2) all parties gossip the received header (H, echo), (3) the honest parties that received echoes containing at least two different headers gossip \((H',pending)\), where \(H'\) contains the null Merkle root and the round, (4) upon receiving \(\frac{nf}{m}+1\) echoes of the same and only header, an honest party gossips (H, accept) along with the received echoes. To increase the transaction throughput, RapidChain allows new leaders to propose new blocks even if the previous block is not yet accepted by all honest parties.
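A party's local decision rule over these rounds can be sketched as follows; gossiping, leader rotation, and pipelining are omitted, and the acceptance threshold is left as a parameter since the expression \(\frac{nf}{m}+1\) depends on how f is normalised.

```python
def shard_consensus_decision(echo_counts: dict, threshold: int):
    """Local decision of an honest RapidChain shard member after the echo round.

    echo_counts maps each header H to the number of (H, echo) messages
    received.  The party gossips (H', pending) if it saw echoes for more than
    one header, gossips (H, accept) if a single header gathered at least
    `threshold` echoes, and otherwise waits for the next proposal.
    """
    headers = [h for h, count in echo_counts.items() if count > 0]
    if len(headers) > 1:
        return ("pending", None)       # conflicting proposals: H' with null Merkle root
    if headers and echo_counts[headers[0]] >= threshold:
        return ("accept", headers[0])
    return ("wait", None)

print(shard_consensus_decision({"H1": 9}, threshold=8))           # ('accept', 'H1')
print(shard_consensus_decision({"H1": 5, "H2": 4}, threshold=8))  # ('pending', None)
print(shard_consensus_decision({"H1": 3}, threshold=8))           # ('wait', None)
```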

StatePartition: Each shard handles a fraction of the transactions, assigned based on the prefix of the transaction ID.

CrossShard: For each cross-shard transaction, the leader of the output shard creates one “dummy” transaction for each input UTXO, in order to move the transaction’s inputs to the output shard and execute the transaction within that shard. To be specific, assume we have a transaction with two inputs \(I_1,I_2\) and one output O. The leader of the output shard creates three new transactions: \(tx_1\) with input \(I_1\) and output \(I'_1\), where \(I'_1\) holds the same amount of money as \(I_1\) and belongs to the output shard; \(tx_2\) is created similarly; and \(tx_3\) with inputs \(I'_1\) and \(I'_2\) and output O. Then the leader sends \(tx_1, tx_2\) to the respective input shards. In principle, the output shard is claiming to be a trusted channel [6] (which is guaranteed by the random assignment), hence the input shards should transfer their assets there and then execute the transaction atomically inside the output shard (or abort by returning the assets to the input shards).
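The leader's splitting step can be sketched as follows; the mapping of a UTXO to its shard is abstracted into a hypothetical `shard_of` function, and the example mirrors the two-input case above.

```python
import hashlib

def shard_of(utxo: str, m: int) -> int:
    """Toy mapping of a UTXO to the shard handling it."""
    return int(hashlib.sha256(utxo.encode()).hexdigest(), 16) % m

def split_cross_shard_tx(inputs, outputs, output_shard: int, m: int):
    """Leader of the output shard splits tx = (inputs -> outputs) into one
    dummy tx per foreign input, moving its value into a fresh UTXO owned by
    the output shard, plus a final dummy tx executed locally."""
    dummy_txs, local_inputs = [], []
    for utxo in inputs:
        src = shard_of(utxo, m)
        if src != output_shard:
            moved = f"{utxo}'@shard{output_shard}"
            dummy_txs.append({"shard": src, "inputs": [utxo], "outputs": [moved]})
            local_inputs.append(moved)
        else:
            local_inputs.append(utxo)
    dummy_txs.append({"shard": output_shard, "inputs": local_inputs, "outputs": list(outputs)})
    return dummy_txs

for tx in split_cross_shard_tx(["I1", "I2"], ["O"], output_shard=0, m=4):
    print(tx)
```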

Sybil: A party can only participate in an epoch if it solves a PoW puzzle with the previous epoch’s randomness, submits the solution to the reference committee, and is consequently included in the next reference block. The reference block contains the active parties’ identities for the next epoch, their shard assignment, and the next epoch’s randomness, and is broadcast by the reference committee at the end of each epoch.

Divide2Shards: During bootstrapping, the parties are partitioned independently and uniformly at random in groups of size \(O(\sqrt{n})\) with a deterministic random process. Then, each group runs the DRG protocol and creates a (local) random seed. Every node in the group computes the hash of the random seed and its public key. The e (small constant) smallest tickets are elected from each group and gossiped to the other groups, along with at least half the signatures of the group. These elected parties are the root group. The root group then selects the reference committee of size \(O(\log n)\), which in turn partitions the parties randomly into shards as follows: each party is mapped to a random position in [0, 1) using a hash function. Then, the range [0, 1) is partitioned into k regions, where k is constant. A shard is the group of parties assigned to \(O(\log n)\) regions.

During epoch transition, a constant number of parties can join (or leave) the system. This process is handled by the reference committee, which determines the next epoch’s shard assignment given the set of active parties for the epoch. The reference committee divides the shards into two groups based on each shard’s number of active parties in the previous epoch: group A contains the m/2 largest shards, while the rest comprise group I. Every new node is assigned uniformly at random to a shard in A. Then, a constant number of parties is evicted from each shard and assigned uniformly at random to a shard in I.
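A minimal sketch of this reconfiguration step, under the stated assumptions (k evicted parties per shard, joins into the larger half A, evictions re-assigned into the smaller half I); the real protocol is executed by the reference committee and recorded in the reference block.

```python
import random

def reconfigure(shards: list, new_parties: list, k: int, rng: random.Random):
    """Bounded Cuckoo rule: new parties join a shard drawn uniformly from the
    m/2 largest shards (group A); then k parties are evicted from every shard
    and re-assigned uniformly into the m/2 smallest shards (group I)."""
    order = sorted(range(len(shards)), key=lambda i: len(shards[i]), reverse=True)
    group_a, group_i = order[: len(shards) // 2], order[len(shards) // 2:]

    for p in new_parties:                       # joins go to the larger shards
        shards[rng.choice(group_a)].append(p)

    evicted = []
    for shard in shards:                        # constant number of evictions per shard
        for _ in range(min(k, len(shard))):
            evicted.append(shard.pop(rng.randrange(len(shard))))
    for p in evicted:                           # evicted parties land in the smaller shards
        shards[rng.choice(group_i)].append(p)
    return shards

rng = random.Random(7)
shards = [[f"s{i}_{j}" for j in range(size)] for i, size in enumerate([6, 5, 4, 3])]
print([len(s) for s in reconfigure(shards, ["new1", "new2"], k=1, rng=rng)])
```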

DRG: RapidChain uses Feldman’s verifiable secret sharing [22] to distributively generate unbiased randomness. At the end of each epoch, the reference committee executes a distributed randomness generation (DRG) protocol to provide the random seed of the next epoch. The same DRG protocol is also executed during bootstrapping to create the root group.
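To illustrate the verifiability behind this choice, the sketch below instantiates Feldman's scheme over a toy group; the parameters are far too small for security and only show how a share is checked against the dealer's public commitments.

```python
# Toy Feldman VSS: q divides p-1 and g generates the order-q subgroup of Z_p^*.
p, q, g = 23, 11, 2            # 2 has multiplicative order 11 modulo 23

def deal(secret: int, n: int, coeffs: list):
    """Dealer: polynomial f(x) = secret + a_1 x + ... + a_t x^t over Z_q,
    shares f(1), ..., f(n), and public commitments C_j = g^{a_j} mod p."""
    poly = [secret] + coeffs
    shares = {i: sum(c * pow(i, j, q) for j, c in enumerate(poly)) % q
              for i in range(1, n + 1)}
    commitments = [pow(g, c, p) for c in poly]
    return shares, commitments

def verify(i: int, share: int, commitments: list) -> bool:
    """Party i checks g^{share} = prod_j C_j^{i^j} (mod p)."""
    lhs = pow(g, share, p)
    rhs = 1
    for j, c in enumerate(commitments):
        rhs = rhs * pow(c, pow(i, j, q), p) % p
    return lhs == rhs

shares, comms = deal(secret=7, n=5, coeffs=[3, 9])
print(all(verify(i, s, comms) for i, s in shares.items()))   # True
print(verify(1, (shares[1] + 1) % q, comms))                 # False: tampered share
```

A party whose share fails this public check can complain against the dealer, which is what makes the jointly generated randomness hard to bias rather than merely secret-shared.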

CompactState: No protocol for compaction of the state is used.

Analysis. RapidChain does not maintain a robust sharded transaction ledger under our security model since it assumes a weaker adversary. To fairly evaluate the protocol, we weaken our security model. First, we assume the adversary cannot change more than a constant number of Byzantine parties during an epoch transition, which we term a constant-adaptive adversary. In general, we assume bounded epoch transitions, i. e., at most a constant number of leave/join requests during each transition. Furthermore, the number of epochs is asymptotically less than polynomial in the number of parties. In this weaker security model, we prove RapidChain maintains a robust sharded transaction ledger and provide an upper bound on the throughput factor of the protocol.

Note that in cross-shard transactions, the “dummy” transactions that are committed as valid in the shards’ ledgers spend UTXOs that are not signed by the corresponding users. Instead, the original transaction, signed by the users, is provided to the shards to verify the validity of the “dummy” transactions. Hence, the transaction validation rules change. Furthermore, the protocol that handles cross-shard transactions has no proof of security against Byzantine leaders. For analysis purposes, we assume the following holds:

Assumption 43

CrossShard satisfies safety even under a Byzantine leader (of the output shard).

Lemma 44

The communication factor of DRG is O(n/m).

Proof

The DRG protocol is executed by the reference committee once each epoch. The size of the reference committee is \(O(n/m)=O(\log n)\). The communication complexity of the DRG protocol is quadratic in the number of parties [22]. Thus, the communication factor is O(n/m).    \(\square \)

Lemma 45

In each epoch, all shards are \(\frac{1}{2}\)-honest for \(m\le f(n)\) with f(n) from Corollary 21.

Proof

During the bootstrapping process of RapidChain (first epoch), the n parties are partitioned independently and uniformly at random into m shards [22]. For \(p=1/3\), the shards are \(\frac{1}{2}\)-honest only if \(m\le f(n)\) with f(n) from Corollary 21. At any time during the protocol, all shards remain \(\frac{1}{2}\)-honest ([55], Theorem 5). Hence, the statement holds after each epoch transition, as long as the number of epochs is o(n).    \(\square \)

Lemma 46

In each epoch, the expected size of each shard is O(n/m).

Proof

During the bootstrapping process of RapidChain (first epoch), the n parties are partitioned independently and uniformly at random into m shards [22]. The expected shard size in the first epoch is n/m. Furthermore, during epoch transition the shards remain “balanced” (Theorem 5 [55]), i. e., the size of each shard is O(n/m).    \(\square \)

Theorem 47

RapidChain satisfies persistence in our system model for constant-adaptive adversaries with \(f<n/3\) and bounded epoch transitions.

Proof

The consensus protocol in RapidChain achieves safety if the shard has no more than \(t<1/2\) fraction of Byzantine parties ([55], Theorem 2). Hence, the statement follows from Lemma 45.    \(\square \)

Theorem 48

RapidChain satisfies liveness in our system model for constant-adaptive adversaries with \(f<n/3\) and bounded epoch transitions.

Proof

To estimate the liveness of RapidChain, we need to examine the following subprotocols: (i) Consensus, (ii) CrossShard, (iii) DRG, and (iv) Divide2Shards.

The consensus protocol in RapidChain achieves liveness if the shard has less than \(\frac{n}{2m}\) Byzantine parties (Theorem 3 [55]). Thus, liveness is guaranteed during Consensus (Lemma 45).

Furthermore, the reference committee is \(\frac{1}{2}\)-honest with high probability. Hence, the reference committee will route each transaction to the corresponding output shard. We assume transactions will reach all relevant honest parties via a gossip protocol. RapidChain employs the IDA-gossip protocol, which guarantees message delivery to all honest parties (Lemma 1 and Lemma 2 [55]). From Assumption 43, the protocol that handles cross-shard transactions satisfies safety even under a Byzantine leader. Hence, all “dummy” transactions will be created and eventually delivered. Since the consensus protocol within each shard satisfies liveness, the “dummy” transactions of the input shards will become stable. Consequently, the “dummy” transaction of the output shard will become valid and eventually stable (consensus liveness). Thus, CrossShard satisfies liveness.

During epoch transition, DRG satisfies liveness [22]. Moreover, Divide2Shards allows only for a constant number of leave/join/move operations and thus terminates in a constant number of rounds.    \(\square \)

Theorem 49

RapidChain satisfies consistency in our system model for constant-adaptive adversaries with \(f<n/3\) and bounded epoch transitions.

Proof

In every epoch, each shard is \(\frac{1}{2}\)-honest; hence, the adversary cannot double-spend and consistency is satisfied.

Nevertheless, to prove consistency is satisfied across shards, we need to prove that cross-shard transactions are atomic. CrossShard in RapidChain ensures that the “dummy” transaction of the output shard becomes valid only if all “dummy” transactions are stable in the input shards. If a “dummy” transaction of an input shard is rejected, the “dummy” transaction of the output shard will not be executed, and all the accepted “dummy” transactions will just transfer the value of the input UTXOs to other UTXOs that belong to the output shard. This holds because the protocol satisfies safety even under a Byzantine leader (Assumption 43).

Lastly, the adversary cannot revert the chain of a shard and double-spend an input of the cross-shard transaction after the transaction is accepted in all relevant shards, because consistency within each shard and persistence (Theorem 47) hold. Suppose persistence holds with probability p. Then, the probability that the adversary breaks consistency in a cross-shard transaction is the probability of successfully double-spending in one of the shards relevant to the transaction, hence \(1-p^v\), where v is the average size of transactions. Since v is constant, consistency holds with high probability, given persistence holds with high probability.    \(\square \)

Similarly to OmniLedger, to calculate the scaling factor of RapidChain, we need to evaluate the following protocols of the system: (i) Consensus, (ii) CrossShard, (iii) DRG, and (iv) Divide2Shards.

Lemma 50

The scaling factors of Consensus are \(\omega _m=O(\frac{n}{m})\), \(\omega _s=O(\frac{1}{m})\), and \(\omega _c=O(\frac{n}{m})\).

Proof

From Lemma 46, the expected number of parties in a shard is O(n/m). The consensus protocol of RapidChain has communication complexity quadratic in the number of parties. Hence, the communication factor of Consensus is \(O(\frac{n}{m})\). The verification complexity (computational factor) collapses to the communication complexity. The space factor is \(O(\frac{1}{m})\), as each party maintains the ledger of the assigned shard for the epoch.    \(\square \)

Lemma 51

The communication and computational factors of CrossShard are both \(\omega _m=\omega _c=O(v\frac{n}{m})\), where v is the average size of transactions.

Proof

During the execution of the protocol, the interaction between the input and output shards is limited to the leader, who creates and routes the “dummy” transactions. Hence, the communication complexity of the protocol is dominated by the consensus within the shards. For an average size of transactions v, the communication factor is \(O(vn/m+v)=O(vn/m)\) (Lemma 46). Note that this bound holds for the worst case, where transactions have \(v-1\) inputs and a single output while all UTXOs belong to different shards.

For each cross-shard transaction, each party of the input and output shards queries the verification oracle once. Hence, the computational factor is O(vn/m). The protocol does not require any verification across shards, thus the only storage requirement per party is to maintain the ledger of its own shard.    \(\square \)

Lemma 52

The communication factor of Divide2Shards is \(O(\frac{R\cdot n}{m^2})\).

Proof

The number of join/leave and move operations is constant per epoch, denoted by k. Further, each shard is \(\frac{1}{2}\)-honest (Lemma 45) and has size \(O(\frac{n}{m})\) (Lemma 46); these guarantees hold as long as the number of epochs is o(n).

Each party changing shards receives the new shard’s ledger of size T/m from O(n/m) parties in the new shard. Thus, the total communication complexity at this stage is \(O(\frac{T}{m}\cdot \frac{n}{m})\), hence the communication factor is \(O(\frac{T}{m^2})=O(\frac{R\cdot e}{m^2})\), where R is the number of rounds in each epoch and e the number of epochs since genesis. Since \(e=o(n)\), the communication factor is \(O(\frac{R\cdot n}{m^2})\).

   \(\square \)

Theorem 53

RapidChain satisfies scalability in our system model for constant-adaptive adversaries with \(f<n/3\) and bounded epoch transitions, with communication and computational factor O(n/m) and space factor O(1/m), where \(n=O(m \log m)\), assuming epoch size \(R=O(m)\).

Proof

Consensus has, in expectation, communication and computational factors bounded by O(n/m) and space factor O(1/m) (Lemma 50). These bounds are similar in CrossShard, where the communication and computational factors are bounded by O(vn/m) (Lemma 51), with v constant (see Sect. 3).

During epoch transitions, the communication factor dominates: In DRG \(\omega _m=O(\frac{n}{m})\) (Lemma 44) while in Divide2Shards \(\omega _m=O(\frac{n\cdot R}{m^2})\) (Lemma 52). Thus for \(R=O(m)\), the communication factor during epoch transitions is O(n/m).

Overall, RapidChain’s expected scaling factors are as follows: \( \omega _m=\omega _c= O(n/m)=O(\log m)\) and \(\omega _s=O(1/m)\), where the equation holds for \(n=c'm \log m\) (Lemma 45).    \(\square \)

Theorem 54

In RapidChain, the throughput factor is \(\sigma =\mu \cdot \tau \cdot \dfrac{m}{v} < \frac{\mu \cdot \tau \cdot f(n)}{v}\) with \(f(n) = \frac{n}{c'\log (\frac{n}{c'\log (n)})}\) with \(c' = \frac{c}{p}\) and constant c from Corollary 20.

Proof

At most v shards are affected per transaction – when each transaction has \(v-1\) inputs and one output, and all belong to different shards. Therefore, \(m'<m/v\). From Lemma 19 and Corollary 21, \( m < f(n)\). Therefore, \(\sigma < \frac{\mu \cdot \tau \cdot f(n)}{v}\).

   \(\square \)

In RapidChain, the consensus protocol is synchronous and thus not practical. We estimate the throughput factor irrespective of the chosen consensus, to provide a fair comparison to other protocols. We notice that both RapidChain and OmniLedger have the same throughput factor when v is constant.

We provide an example of the throughput factor in case the employed consensus is the one suggested in RapidChain. In this case, we have \(a=1/2\), \(p=1/4\) (hence \(p/a=1/2\)), \(\mu <1/2\) (Theorem 1 [55]), and \(\tau =1/8\) (4 rounds are needed to reach consensus under an honest leader, and the leader is honest once every two rounds in expectation [54]). Note that \(\tau \) can be improved by allowing the next leader to propose a block even if the previous block is not yet accepted by all honest parties; however, we do not consider this improvement. From the values of p and a we compute \(c\simeq 2.6\), thus \(c'\simeq 10.4\). Hence, for \(v=5\), we have throughput factor:

$$\begin{aligned} \sigma < \frac{1}{2} \cdot \frac{1}{8} \cdot \frac{1}{5} \cdot \frac{1}{10.4} \frac{n}{\log (\frac{n}{10.4\log n})} =\frac{n}{832 \log (\frac{n}{10.4\log n})} \end{aligned}$$
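For concreteness, this bound can be evaluated numerically; the snippet below simply plugs sample values of n into the displayed formula (natural logarithm is assumed here; the choice of base only shifts the constant).

```python
import math

def rapidchain_throughput_bound(n: int, c_prime: float = 10.4,
                                mu: float = 0.5, tau: float = 0.125, v: int = 5) -> float:
    """sigma < mu * tau * f(n) / v with f(n) = n / (c' * log(n / (c' * log n)))."""
    f_n = n / (c_prime * math.log(n / (c_prime * math.log(n))))
    return mu * tau * f_n / v

for n in (1_000, 10_000, 100_000):
    print(f"n={n:>7}: sigma < {rapidchain_throughput_bound(n):.1f}")
```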

1.5 C.5 Chainspace

Chainspace is a sharding protocol introduced by Al-Bassam et al. [3] that operates in the permissioned setting. The main innovation of Chainspace is at the application layer. Specifically, Chainspace presents a sharded, UTXO-based distributed ledger that supports smart contracts. Furthermore, limited privacy is enabled by offloading computation to the clients, who only need to publicly provide zero-knowledge proofs that their computation is correct. Chainspace focuses on specific aspects of sharding; epoch transition or reconfiguration of the protocol is not addressed. Nevertheless, the cross-shard communication protocol, namely S-BAC, is of interest as a building block for secure sharding.

S-BAC Protocol. S-BAC is a shard-led cross-shard atomic commit protocol used in Chainspace. In S-BAC, the client submits a transaction to the input shards. Each shard internally runs a BFT protocol to tentatively decide whether to accept or abort the transaction locally and broadcasts its local decision to the other shards that take part in the transaction. If the transaction fails locally (e.g., is a double-spend), the shard generates pre-abort(T), whereas if the transaction succeeds locally, the shard generates pre-accept(T) and changes the state of the input to ‘locked’. After a shard decides to pre-accept(T), it waits to collect responses from the other participating shards, and commits the transaction if all shards respond with pre-accept(T), or aborts the transaction if at least one shard announces pre-abort(T). Once the shards decide, they send their decision (accept(T) or abort(T)) to the client and the output shards. If the decision is accept(T), the output shards generate new ‘active’ objects and the input shards change the input objects to ‘inactive’. If an input shard’s decision is abort(T), all input shards unlock the input objects by changing their state back to ‘active’.
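A minimal sketch of the shard-side decision logic follows, with the per-shard BFT agreement abstracted into a single local check; message names follow the description above, and the data layout is illustrative.

```python
def input_shard_phase1(shard_state: dict, tx: dict) -> str:
    """Input shard's tentative decision: lock its inputs and pre-accept, or
    pre-abort if an input is missing, locked, or already spent."""
    held = [o for o in tx["inputs"] if o in shard_state]
    if not held or any(shard_state[o] != "active" for o in held):
        return "pre-abort"
    for o in held:
        shard_state[o] = "locked"
    return "pre-accept"

def finalize(shard_states: list, decisions: list, tx: dict) -> str:
    """Second phase: commit iff every participating shard pre-accepted,
    otherwise unlock the inputs.  The last entry of shard_states is taken to
    be the output shard, which materialises the new objects on commit."""
    commit = all(d == "pre-accept" for d in decisions)
    for state in shard_states:
        for o in tx["inputs"]:
            if o in state:
                state[o] = "inactive" if commit else "active"
    if commit:
        shard_states[-1].update({o: "active" for o in tx["outputs"]})
    return "accept(T)" if commit else "abort(T)"

s_in1, s_in2, s_out = {"x": "active"}, {"y": "active"}, {}
tx = {"inputs": ["x", "y"], "outputs": ["z"]}
decisions = [input_shard_phase1(s_in1, tx), input_shard_phase1(s_in2, tx)]
print(finalize([s_in1, s_in2, s_out], decisions, tx))   # accept(T)
print(s_out)                                            # {'z': 'active'}
```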

S-BAC, just like Atomix, is susceptible to replay attacks [48]. To address this problem, sequence numbers are added to the transactions, and output shards generate dummy objects during the first phase (pre-commit, pre-abort). More details and security proofs can be found in [48], as well as a hybrid of Atomix and S-BAC called Byzcuit.

