On a decentralized trustless pseudo-random number generation algorithm

Serguei Popov

doi:10.1515/jmc-2016-0019

Open Access Published by De Gruyter January 10, 2017

From the journal Journal of Mathematical Cryptology

https://doi.org/10.1515/jmc-2016-0019

Abstract

We construct an algorithm that permits a large group of individuals to reach consensus on a random number, without having to rely on any third parties. The algorithm works with high probability if fewer than 50% of the parties in the group are colluding. We also describe some modifications and generalizations of the algorithm.

Keywords: Public randomness; collusion; algorithm works w.h.p.

MSC 2010: 68W15; 90B18; 65C10; 60-04; 65C05

1 Introduction and description of the basic algorithm

Suppose that there is a (large) collection of $n$ individuals who want to reach consensus on a random number $s\in\{0,1\}^N$ but, in general, trust neither each other nor any third parties. The outcome should have uniform distribution, be unpredictable for everybody until revealed, and the whole procedure should be transparent/verifiable to both participants and outsiders.

Assume also that, among them, there are $pn$ colluding parties that want to mess with the procedure (e.g., force the “random” outcome to have a specific value, or at least make it biased), where $p\in(0,1)$. We suppose that they can exchange information freely and secretly from the others, and can agree on a common strategy. Also, we make a “worst-case” assumption that, at any stage of the algorithm, the colluding parties may first wait for the others’ actions, and then choose what to do according to the information they currently have.

So, is it possible to carry out this random number generation under some reasonable assumption on the number of colluding parties? The first solution that comes to mind is to ask the parties to choose their random numbers, reveal them all at the same time, and then just apply the exclusive “or” (XOR) operation to all these numbers. This, however, has the drawback that, in practice, one cannot enforce that the revelation happens at precisely the same moment for all parties involved. The last one to reveal has, in fact, total control over the outcome, which can be exploited by a malicious player. Even if we make the parties commit to their numbers, the last one still has the option of never revealing its number, thus effectively introducing a bias into the final result.

Therefore, more complicated procedures have to be considered. One possible approach is to introduce security deposits (see the discussion in [8]): each party makes a deposit before the procedure and does not receive it back if it fails to reveal its secret number. There is, however, a problem with this approach. If the deposits are small, they will not prevent cheating when the stakes are high; if the deposits are large, people may not be willing to take part in such a procedure (“what if my connection goes down between the 1st and 2nd stages?”)^[1]. Some solutions based on public ledgers (such as the Bitcoin blockchain) were proposed, see [1, 4]. An interesting algorithm was proposed in [3]: first, inputs from many different parties are collected, and the outcome is then computed as a deterministic function of them; this computation, however, is deliberately made so slow that no party is able to influence the result by submitting an entry at the last moment. The paper [7] mentions a procedure that apparently has some similarities with the one of the present paper, but without giving many details. We refer to the aforementioned papers for more discussion on public randomness and related topics.
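To make the weakness of the naive XOR scheme concrete, here is a minimal Python sketch of the case without commitments. The names `xor_all` and `last_revealer_choice`, the bit length `N_BITS` and the target value are illustrative assumptions, not part of the paper. It shows that whoever reveals last can pick a number that forces any desired outcome.

```python
import secrets
from functools import reduce
from operator import xor

N_BITS = 32  # illustrative bit length N of the outcome (an assumption)


def xor_all(values):
    """XOR a sequence of integers together (0 for an empty sequence)."""
    return reduce(xor, values, 0)


def last_revealer_choice(revealed, target):
    """Number the last party must announce so that the total XOR equals target."""
    return xor_all(revealed) ^ target


# Nine honest parties reveal uniformly random numbers first.
honest = [secrets.randbits(N_BITS) for _ in range(9)]

# The tenth party waits, then announces whatever makes the result equal its target.
target = 0xDEADBEEF
forged = last_revealer_choice(honest, target)
print(hex(xor_all(honest + [forged])))
```

Commitments remove this particular attack, but, as noted above, the last party can still bias the result by choosing whether to reveal at all.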

In the sequel, we introduce an algorithm that achieves this goal with high probability. In comparison to that of [3], it does not require heavy computations, and so it may work for parties with low-end devices. Hopefully, this algorithm is well suited to a blockchain-based environment, with a lot of independent parties which do not trust each other. Our approach can be described in the following way. First, each party secretly chooses its number from some high-entropy random source. Then, it commits to that number in some usual way, for instance, by publishing its hash. In practice, when the random number generation has to be performed many times in a row, the parties may commit to the last element of a long hash chain (so that each revealed number is at the same time a commitment to the next one to be revealed). After committing, each party has the option to reveal or withhold its number, but cannot lie about it.
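The commitment step can be sketched as follows. This is only an illustration under stated assumptions: the paper does not fix a hash function, so SHA-256 is assumed here, and the helper names `commit`, `verify` and `hash_chain` are hypothetical.

```python
import hashlib
import secrets
from typing import List


def commit(secret: bytes) -> bytes:
    """Commitment to a secret: its SHA-256 digest (hash choice is an assumption)."""
    return hashlib.sha256(secret).digest()


def verify(commitment: bytes, revealed: bytes) -> bool:
    """Check that a revealed value matches an earlier commitment."""
    return hashlib.sha256(revealed).digest() == commitment


def hash_chain(seed: bytes, length: int) -> List[bytes]:
    """Build chain[0] = seed, chain[i + 1] = SHA-256(chain[i])."""
    chain = [seed]
    for _ in range(length):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain


# One-shot commitment to a high-entropy secret.
secret = secrets.token_bytes(32)
published = commit(secret)
print(verify(published, secret))

# Repeated rounds: publish only the last element of the chain, then reveal
# the elements one by one in reverse order; each reveal is checked against
# the value revealed (or published) just before it.
chain = hash_chain(secrets.token_bytes(32), 5)
print(all(verify(chain[i + 1], chain[i]) for i in range(5)))
```

With a chain, a party can still withhold a value, but it cannot substitute a different one without failing the check.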

Next, let us randomly divide the crowd into $n/k$ groups of size $k$ (for simplicity, we assume that $n/k$ is an integer). Such a division can be done in a decentralized and verifiable way that cannot be manipulated too much. For example (think about cryptocurrency accounts, such as Eth or Nxt), hash the public keys of the parties’ accounts with some (sufficiently random, but public) seed^[2] and order the results; then form the groups based on that order. Now, our basic algorithm can be described in the following way. First, the secrets are shared within each group (so that any party that decides to withhold its number will be eliminated at this step), and then any representative of a group where this procedure succeeded (all members shared their numbers) reveals the numbers and everything is XORed together. More specifically, the following steps are carried out:

Members of each group share their numbers between them. One way to do this would be that a party encrypts its secret number using the public keys of the other k-1 members of its group separately, and then broadcasts the k-1 outcomes.
Then, the group publishes a statement like “I know the secrets of all my group” signed by everybody. It is unclear whether this step is strictly necessary, but it is probably a good idea to include it, to ensure that consensus is reached on which groups successfully shared their secrets.
Any group that did not publish such a statement is eliminated. In fact, any party can eliminate the group it belongs to by publishing a signed statement that the secret sharing did not succeed within that group^[3].
Then, at least one representative of each group that was not eliminated publishes all the numbers shared by the group members.
All the published numbers are XORed to obtain the final outcome.
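The group formation and the final XOR described above can be sketched in a few lines of Python. This is a simplified illustration, not the author's implementation: encryption, signatures and networking are omitted, SHA-256 is an assumed hash, public keys are plain byte strings, and the names `form_groups` and `outcome` are hypothetical.

```python
import hashlib
from functools import reduce
from operator import xor


def form_groups(public_keys, seed: bytes, k: int):
    """Order parties by SHA-256(seed || key) and cut the order into groups of size k.

    Assumes len(public_keys) is a multiple of k, as in the text.
    """
    ordered = sorted(public_keys, key=lambda pk: hashlib.sha256(seed + pk).digest())
    return [ordered[i:i + k] for i in range(0, len(ordered), k)]


def outcome(groups, shared):
    """XOR the numbers of every group in which all members shared their secret.

    `shared` maps a public key to its revealed integer; a missing key means
    that party withheld its number, which eliminates its whole group.
    """
    result = 0
    for group in groups:
        if all(pk in shared for pk in group):
            result ^= reduce(xor, (shared[pk] for pk in group), 0)
    return result


keys = [bytes([i]) for i in range(6)]            # toy stand-ins for public keys
groups = form_groups(keys, b"public seed", 3)
numbers = {pk: pk[0] + 1 for pk in keys}         # toy stand-ins for secret numbers
print(outcome(groups, numbers))
```

Because the ordering depends only on the public seed and the public keys, anyone can recompute the groups and check the result.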

Now, this algorithm indeed achieves our goals if

(a) there is at least one group which contains only honest^[4] members (so that the colluding parties cannot know their numbers beforehand),
(b) no group consists entirely of colluding parties (otherwise, such a group could introduce a bias to the final outcome by not revealing their numbers).

We denote the event from (a) by $A$, and the event from (b) by $B$. Also, let us introduce the events

$$A_j=\{\text{all parties of the $j$th group are honest}\},$$

$$B_j=\{\text{at least one party of the $j$th group is honest}\}$$

for $j=1,\dots,n/k$, so that $A=\bigcup_{j=1}^{n/k}A_j$ and $B=\bigcap_{j=1}^{n/k}B_j$.

Now, we need to choose $k$ in such a way that $\mathbb{P}[A\cap B]$ is close to 1. Let us try to set $k=c\ln n$, where $c>0$ is a parameter^[5], and estimate the probabilities of the above events. Also, instead of fixing the number of colluding parties, we rather just assume that each party is malicious with probability $p$, independently of the others, and all the malicious parties collude. Clearly, the situation remains essentially the same, but the calculations become much easier.

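First, we clearly have $\mathbb{P}[A_j]=(1-p)^k=(1-p)^{c\ln n}=n^{-c\ln(1-p)^{-1}}$ (here and below, $\ln x^{-1}$ stands for $\ln(1/x)$), so we can write

$$\begin{aligned}
\mathbb{P}[A] &= 1-\mathbb{P}\Big[\bigcap_{j=1}^{n/k}A_j^{\complement}\Big]\\
&= 1-\big(1-(1-p)^k\big)^{n/k} && (1)\\
&= 1-\big(1-n^{-c\ln(1-p)^{-1}}\big)^{\frac{n}{c\ln n}}\\
&\simeq 1-\exp\Big(-\frac{n^{1-c\ln(1-p)^{-1}}}{c\ln n}\Big), && (2)
\end{aligned}$$

and the value of the above expression is quite close to 1 if $c\ln(1-p)^{-1}<1$ and $n$ is large enough.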

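Analogously, we have that $\mathbb{P}[B_j]=1-p^k=1-n^{-c\ln p^{-1}}$, so

$$\begin{aligned}
\mathbb{P}[B] &= (\mathbb{P}[B_1])^{n/k}\\
&= (1-p^k)^{n/k} && (3)\\
&= \big(1-n^{-c\ln p^{-1}}\big)^{\frac{n}{c\ln n}}\\
&\simeq \exp\Big(-\frac{n^{-c\ln p^{-1}+1}}{c\ln n}\Big)\\
&\simeq 1-\frac{n^{-c\ln p^{-1}+1}}{c\ln n} && (4)
\end{aligned}$$

if $c\ln p^{-1}>1$. That is, we obtain that (at least for large $n$) $c$ must satisfy the following condition:

$$\frac{1}{\ln p^{-1}}<c<\frac{1}{\ln(1-p)^{-1}}, \qquad (5)$$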

which is clearly possible if and only if p<0.5. So, we have just proved the following result.

Proposition 1

Assume that p<0.5. Then

$$\mathbb{P}[A\cap B]\gtrsim 1-\exp\Big(-\frac{n^{1-c\ln(1-p)^{-1}}}{c\ln n}\Big)-\frac{n^{-c\ln p^{-1}+1}}{c\ln n}. \qquad (6)$$
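For completeness, the claim that condition (5) can be satisfied if and only if $p<0.5$ follows from a short chain of equivalences; both logarithms below are positive for $p\in(0,1)$, so inverting the fractions reverses the inequality. The numerical illustration for $p=0.2$ is added here as an example and is not taken from the paper.

```latex
\frac{1}{\ln p^{-1}} < \frac{1}{\ln (1-p)^{-1}}
\iff \ln p^{-1} > \ln (1-p)^{-1}
\iff p^{-1} > (1-p)^{-1}
\iff p < 1-p
\iff p < \tfrac12 .

% Example: for p = 0.2, condition (5) reads
% 1/\ln 5 \approx 0.62 < c < 1/\ln 1.25 \approx 4.48 .
```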

Now, since $p$ is generally unknown, we need to find a way to choose $k=k(n)$ (that is, we need to choose $c=\frac{k}{\ln n}$) that does not depend on $p$. One possible way to do this would be to fix a “reasonable” value of $p<0.5$ (e.g., $p=0.1$), and then find the value of $c$ that maximizes the expression in the right-hand side of (6). However, this may not be the best solution, for the following reason: the occurrence of the event $A^{\complement}$ is much worse (from the point of view of honest parties) than the occurrence of $B^{\complement}$. Indeed, if there is a group that consists entirely of colluding parties (i.e., $B^{\complement}$ occurs), then at the last stage of the procedure they have the option of not revealing their numbers at all (after waiting for all the others to reveal), thus influencing the final outcome. However, the bias introduced this way is typically not so strong (there are only two options for the group, reveal or not reveal), and, perhaps more importantly, the act of not revealing their numbers is practically a confession “we are all malicious”^[6]. In an environment where reputation (in any reasonable sense of this word) matters, such a thing could be quite harmful from the point of view of the colluding parties^[7].

On the other hand, on the event $A^{\complement}$, there will be at least one “spy” in each group, i.e., the colluders will know all secrets already at the first stage! Of course, this opens many more possible ways of cheating, e.g., the colluding parties may eliminate groups in any possible combination, thus making quite broad adjustments to the final outcome, all this without raising much suspicion.

So, an adequate way of choosing $k(n)$ is rather the following. First, we fix some $p<0.5$ that we believe to be an upper bound on the proportion of colluding parties. Then, we decide on the acceptable values of $\alpha=\mathbb{P}[A^{\complement}]$ and $\beta=\mathbb{P}[B^{\complement}]$, for instance, $\alpha=0.005$ and $\beta=0.05$. If $n$ is fixed, in general we can hope to control only one of the quantities $\alpha,\beta$ (observe that, when $n$ is fixed and $k$ increases, $\alpha$ increases and $\beta$ decreases). If one wants to control both quantities at once, one may need to increase $n$. See Table 1 for some numerical examples. Clearly, in practice it is better to use the exact formulas (1) and (3) for calculations.

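Table 1

A few numerical examples.

$p$	$n$	$k$	$\alpha=\mathbb{P}[A^{\complement}]$	$\beta=\mathbb{P}[B^{\complement}]$
0.1	60	6	$51\times10^{-5}$	$10^{-5}$
0.1	60	5	$22\times10^{-6}$	$12\times10^{-5}$
0.1	60	3	$46\times10^{-13}$	$198\times10^{-4}$
0.2	60	6	$478\times10^{-4}$	$64\times10^{-5}$
0.2	60	5	$853\times10^{-5}$	$383\times10^{-5}$
0.2	60	3	$58\times10^{-8}$	$148\times10^{-3}$
0.2	120	8	$63\times10^{-3}$	$38\times10^{-6}$
0.2	120	6	$228\times10^{-5}$	$128\times10^{-5}$
0.2	120	5	$72\times10^{-6}$	$765\times10^{-5}$
0.3	120	8	$41\times10^{-2}$	$98\times10^{-5}$
0.3	120	6	$81\times10^{-3}$	$145\times10^{-4}$
0.3	120	5	$121\times10^{-4}$	$567\times10^{-4}$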
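The entries of Table 1 can be recomputed directly from the exact formulas (1) and (3). The Python sketch below does that; the function names `alpha`, `beta` and `admissible_sizes` are illustrative, and the last helper (which lists the group sizes dividing $n$ that meet given targets for both error probabilities) is an addition for convenience rather than something described in the paper.

```python
def alpha(p: float, n: int, k: int) -> float:
    """P[A^c] from formula (1): no group is entirely honest."""
    return (1.0 - (1.0 - p) ** k) ** (n / k)


def beta(p: float, n: int, k: int) -> float:
    """P[B^c] from formula (3): some group is entirely malicious."""
    return 1.0 - (1.0 - p ** k) ** (n / k)


def admissible_sizes(p: float, n: int, alpha_max: float, beta_max: float):
    """Group sizes k dividing n for which both error probabilities are small enough."""
    return [
        k
        for k in range(1, n + 1)
        if n % k == 0 and alpha(p, n, k) <= alpha_max and beta(p, n, k) <= beta_max
    ]


for p, n, k in [(0.1, 60, 6), (0.2, 120, 6), (0.3, 120, 5)]:
    print(f"p={p} n={n} k={k} alpha={alpha(p, n, k):.2e} beta={beta(p, n, k):.2e}")

print(admissible_sizes(0.1, 60, alpha_max=0.005, beta_max=0.05))
```

Scanning over $k$ in this way makes the trade-off visible: increasing $k$ raises $\alpha$ and lowers $\beta$, so for small $n$ the admissible list may well be empty.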

Notice that (recall (5)) $c=\frac{1}{\ln 2}$ works for any $p<0.5$. In particular (recall (2) and (4)), the good news is that the decay of $\mathbb{P}[A^{\complement}]$ is much more rapid than the decay of $\mathbb{P}[B^{\complement}]$ (stretched exponential vs. polynomial). So, for given $n$, choosing

$$k:=\Big\lfloor\frac{\ln n}{\ln 2}\Big\rfloor=\lfloor\log_2 n\rfloor=\max\{\ell\in\mathbb{N}:2^{\ell}\le n\} \qquad (7)$$

is probably a good rule of thumb ($\lfloor\cdot\rfloor$ stands for the lower integer part^[8], i.e., $\lfloor x\rfloor$ is the largest integer not exceeding $x$). Observe that this rule gives $k=5$ for $n=60$ and $k=6$ for $n=120$, see Table 1.
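Rule (7) is easy to evaluate without floating-point logarithms. A minimal Python sketch, with the hypothetical helper name `group_size`, uses the bit length of $n$:

```python
def group_size(n: int) -> int:
    """Rule (7): k = floor(log2 n), i.e. the largest l with 2**l <= n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n.bit_length() - 1


for n in (60, 120, 1000):
    print(n, group_size(n))
```

Note that the rule does not guarantee that $k$ divides $n$; in the two cases mentioned in the text ($n=60$ and $n=120$) it happens to do so.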

Also, it should be observed that, although α and β can be made arbitrarily small for any p<0.5 (as we have just shown), the corresponding value of n can be quite large (if p is very close to 0.5).

2 Some modifications and generalizations

Consider the following situation. The overall proportion of colluding parties is small, but the nodes of the network are frequently offline, so, with high probability, no group consists entirely of active (i.e. not offline) parties. In this case, the algorithm of Section 1 will just halt, that is, will not produce any outcome. Even worse, for some values of the parameters it may happen that, with high probability, there are still some complete groups (i.e., everybody in the group is online), but each group does contain at least one malicious party (and, as discussed above, this means giving almost total control to the colluding parties)^[9].

Therefore, in some situations it may be impractical to only accept the complete groups. In this section, we consider some modifications of the previous algorithm that address this issue.

The initial setup is the same: n parties are divided into n/k groups of size k, and we assume that they all have committed on their secret numbers. Then, again, they attempt to share their numbers between them, but now there is no requirement that only complete groups pass to the next round; instead, we consider the group valid if the number of parties that shared their secrets^[10] is greater than k/2 (just to avoid dealing with several “conflicting” subgroups of the same group).

Then, this algorithm works fine if

(a) there is at least one group such that more than $k/2$ of its members are online and all of them are honest,
(b) no group contains more than $k/2$ malicious members.

Similarly, we denote the event from (a) by $\tilde A$, and the event from (b) by $\tilde B$. Also, let us introduce the events

$$\tilde A_j=\{\text{all parties of the $j$th group are honest and more than $k/2$ of them are online}\},$$

$$\tilde B_j=\{\text{less than $k/2$ parties of the $j$th group are malicious}\}$$

for $j=1,\dots,n/k$, so, as before, $\tilde A=\bigcup_{j=1}^{n/k}\tilde A_j$ and $\tilde B=\bigcap_{j=1}^{n/k}\tilde B_j$. Again, we intend to choose $k=c\ln n$ in such a way that $\mathbb{P}[\tilde A\cap\tilde B]$ is close to 1. We keep the assumption that $p$ is the probability that a party is malicious, but we assume also that an honest party is offline with probability $r>0$ (that is, a party is malicious, honest but offline, or honest and online with probabilities $p$, $(1-p)r$, $(1-p)(1-r)$, respectively).

Let us denote by

$$\Phi(k,q,s)=\sum_{\ell<s}\binom{k}{\ell}q^{\ell}(1-q)^{k-\ell}$$

the probability that the value of a binomial $\mathcal{B}(k,q)$ random variable is less than $s$. Now, let us recall the usual Chernoff bound for the binomial distribution^[11]: for any $k$ and $a$ with $0<a<q<1$, we have

$$\Phi(k,q,ak)\le\exp\big(-kH(a,q)\big), \qquad (8)$$

where

$$H(a,q)=a\ln\frac{a}{q}+(1-a)\ln\frac{1-a}{1-q}>0.$$

Moreover, it is easy to see that $\Phi(k,q,ak)$ is close to 1 when $a>q$ (in which case, one may write $\Phi(k,q,ak)=1-\Phi(k,1-q,(1-a)k)$ and apply the above estimates).
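To get a feeling for how tight the bound (8) is, one can compare it numerically with the exact binomial probability. The sketch below is an illustration only; the function names `phi`, `entropy_h` and `chernoff_bound` are assumptions introduced here, and `math.comb` requires Python 3.8 or later.

```python
from math import ceil, comb, exp, log


def phi(k: int, q: float, s: float) -> float:
    """Phi(k, q, s) = P[Binomial(k, q) < s], summed over integers 0 <= l < s."""
    upper = min(k, ceil(s) - 1)  # largest integer strictly below s, capped at k
    return sum(comb(k, l) * q ** l * (1 - q) ** (k - l) for l in range(upper + 1))


def entropy_h(a: float, q: float) -> float:
    """H(a, q) = a ln(a/q) + (1-a) ln((1-a)/(1-q)), for 0 < a, q < 1."""
    return a * log(a / q) + (1 - a) * log((1 - a) / (1 - q))


def chernoff_bound(k: int, q: float, a: float) -> float:
    """Right-hand side of (8): exp(-k H(a, q)), valid for 0 < a < q < 1."""
    return exp(-k * entropy_h(a, q))


q, a = 0.8, 0.5
for k in (4, 8, 16, 32):
    print(k, phi(k, q, a * k), chernoff_bound(k, q, a))
```

Both columns decay exponentially in $k$ at the same rate $H(a,q)$; the exact probability is smaller by a factor that varies only polynomially in $k$.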

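We have

$$\mathbb{P}[\tilde A_j]=\mathbb{P}[A_j]\,\mathbb{P}[\text{more than $k/2$ are online}\mid A_j]=(1-p)^k\,\Phi(k,r,k/2), \qquad (9)$$

since, given $A_j$, more than $k/2$ members of the group are online exactly when fewer than $k/2$ of them are offline, and each honest party is offline with probability $r$.

Assume for simplicity that $r<\frac12$. Then, the last term in the right-hand side of (9) is close to 1. So, in this case we have $\mathbb{P}[\tilde A_j]\simeq\mathbb{P}[A_j]$, and therefore (2) still holds.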

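Next, we use (8) with $k=c\ln n$, $q=1-p$ and $a=\frac12$ to obtain (note that $H(a,q)$ then equals $\frac12\ln(4p(1-p))^{-1}$)

$$\begin{aligned}
\mathbb{P}[\tilde B_j] &= 1-\Phi(k,1-p,k/2)\\
&\ge 1-\exp\Big(-c\ln n\times\tfrac12\ln(4p(1-p))^{-1}\Big)\\
&= 1-n^{-\frac{c}{2}\ln(4p(1-p))^{-1}},
\end{aligned}$$

$$\begin{aligned}
\mathbb{P}[\tilde B] &= (\mathbb{P}[\tilde B_1])^{n/k}\\
&\ge \big(1-n^{-\frac{c}{2}\ln(4p(1-p))^{-1}}\big)^{\frac{n}{c\ln n}}\\
&\gtrsim \exp\Big(-\frac{n^{1-\frac{c}{2}\ln(4p(1-p))^{-1}}}{c\ln n}\Big)\\
&\simeq 1-\frac{n^{-\frac{c}{2}\ln(4p(1-p))^{-1}+1}}{c\ln n}
\end{aligned}$$

if $\frac{c}{2}\ln(4p(1-p))^{-1}>1$. That is, we obtain that (for large $n$) $c$ must satisfy the following condition:

$$\frac{2}{\ln(4p(1-p))^{-1}}<c<\frac{1}{\ln(1-p)^{-1}}. \qquad (10)$$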

The left-hand side of (10) increases when $p\in(0,\frac12)$, and the right-hand side decreases; so, the solution exists for $p<0.2$ (if $p=0.2$, both terms become equal). So, we have just proved the following result.

Proposition 2

Assume that p<0.2. Then

$$\mathbb{P}[\tilde A\cap\tilde B]\gtrsim 1-\exp\Big(-\frac{n^{1-c\ln(1-p)^{-1}}}{c\ln n}\Big)-\frac{n^{-\frac{c}{2}\ln(4p(1-p))^{-1}+1}}{c\ln n}.$$
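The threshold $p=0.2$ can be verified by hand: it is the value at which the two bounds in (10) coincide. The short derivation below is added as a check and uses only the expressions appearing in (10).

```latex
\frac{2}{\ln (4p(1-p))^{-1}} = \frac{1}{\ln (1-p)^{-1}}
\iff \ln (4p(1-p))^{-1} = 2\ln (1-p)^{-1}
\iff 4p(1-p) = (1-p)^2
\iff 4p = 1-p
\iff p = \tfrac15 .

% At p = 1/5 both sides of (10) equal 1/\ln(5/4) \approx 4.48 .
```

Since the left-hand side of (10) increases in $p$ and the right-hand side decreases, the interval for $c$ is non-empty exactly when $p<\frac15$.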

Let us also briefly comment on the case $r>\frac12$. The last term in the right-hand side of (9) would then also be polynomially small in $n$, and (8) can be used to estimate it (as we commented, the relation (8) is, in fact, “almost equality”, so it gives essentially the correct order of decay). In the same way, one can arrive at a modified version of (10), with $\big(\ln(1-p)^{-1}+\frac12\ln(4r(1-r))^{-1}\big)^{-1}$ in the right-hand side. This, in turn, leads to a more complicated existence condition involving $p$ and $r$, which we prefer not to write in an explicit way.

Next, as in Section 1, we can argue that $\tilde A^{\complement}$ is much worse than $\tilde B^{\complement}$. Also, all the past discussion about how to choose $n$ and $k$ remains valid. Observe, however, that the algorithm we just considered is less “robust” than the one of the previous section, since it can fence off at most 20% of colluding parties (vs. formerly 50%). Let us now briefly mention some further modifications/generalizations of the algorithm that aim to increase its robustness. We do not present any further computations; we hope that the reader agrees that the corresponding asymptotic analysis (as $n\to\infty$) can be done in the same way as above.

So, first, we may consider a subgroup (where all secrets were shared) valid if it has at least $\gamma k$ members, where $\gamma\in(0,1)$ (for $\gamma\in(0,\frac12)$ we also need to introduce some rules about which subgroup of a given group should win if there are several of them). One can show that for $\gamma\in(\frac12,1)$ the algorithm becomes resistant against a proportion $p_\gamma$ of colluding parties, where $0.2<p_\gamma<0.5$.

Also, on the second stage, before XORing all the revealed numbers, we may first eliminate, say, some fixed proportion of the lower-sized valid subgroups. This gives some additional chances to get rid of all-malicious subgroups, since those must typically be of smaller size.

Another possibly useful observation is the following. Notice that it may be impractical to deal with a very large number of parties, due to, e.g., connection/synchronization issues. However, we can take a larger crowd first, and then choose a proportion of it at random. Thus, if some party wants to mess with this, it would need to bribe a very large number of other parties.

Conclusions

We presented an algorithm that, with high probability, allows a large number of parties to agree on a random number in a decentralized and trustless way.
Our basic algorithm can be described in the following way. First, each party chooses its secret number from some high-entropy source of randomness, and commits to it. Then, groups of parties (of equal sizes) are formed, and the secrets are shared within each group (so that any party that decides to withhold its number will be eliminated at this step). Next, any representative of a group where this procedure succeeded (i.e., all members shared their secret numbers) reveals the numbers and everything is XORed together.
Under the assumption that the proportion of colluding parties does not exceed 50%, it is possible to show that the group size can be chosen in such a way that, with high probability, there is at least one group consisting entirely of honest parties, and no group consists entirely of colluders. This ensures that the algorithm works as intended.
In practice, there can be no “universal” rule on how to choose $n$ and $k=k(n)$, but a good rule of thumb is choosing $n$ to be as large as possible, and $k=\lfloor\log_2 n\rfloor$, as in (7).
We analyzed also a modification of the above algorithm, where the requirement that all members of the group must share their secrets is relaxed. This may be useful when dealing with situations when honest parties are frequently offline. We also proposed some further modifications and generalizations.

Communicated by Ed Dawson

References

[1] Bonneau J., Clark J. and Goldfeder S., On Bitcoin as a public randomness source, preprint 2015, https://eprint.iacr.org/2015/1015.

[2] Dembo A. and Zeitouni O., Large Deviations Techniques and Applications, Springer, Berlin, 2010. doi:10.1007/978-3-642-03311-7

[3] Lenstra A. K. and Wesolowski B., Trustworthy public randomness with sloth, unicorn, and trx, preprint 2015, http://eprint.iacr.org/2015/366.pdf; to appear in Int. J. Appl. Cryptogr. doi:10.1504/IJACT.2017.089354

[4] Pierrot C. and Wesolowski B., Malleability of the blockchain’s entropy, preprint 2016, https://eprint.iacr.org/2016/370.pdf. doi:10.1007/s12095-017-0264-3

[5] Ross S. M., A First Course in Probability, 8th ed., Prentice-Hall, Upper Saddle River, 2009.

[6] Shiryaev A. N., Probability, Springer, New York, 1996. doi:10.1007/978-1-4757-2539-1

[7] Syta E., Tamas I., Visher D., Wolinsky D. I., Gasser L., Gailly N. and Ford B., Keeping authorities “honest or bust” with decentralized witness cosigning, preprint 2015, https://arxiv.org/abs/1503.08768v2. doi:10.1109/SP.2016.38

[8] How can I securely generate a random number in my smart contract?, http://ethereum.stackexchange.com/questions/191/how-can-i-securely-generate-a-random-number-in-my-smart-contract.

Received: 2016-4-4

Revised: 2016-10-9

Accepted: 2016-11-30

Published Online: 2017-1-10

Published in Print: 2017-3-1

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
