Safety

The safety property is a statement about how long a transaction has to be on the selected chain before our confidence that it won't revert is at least $1-\varepsilon$. We call this duration the approval time $S(\alpha,\varepsilon)$.

We already agreed there is no hope to defend against a $\frac{1}{2}$-attacker, so we assume $\alpha < \frac{1}{2}$.

For such $\alpha$, we want the confidence to grow exponentially fast. In other words, if we make $\varepsilon$ twice as small, we don't want $S(\alpha,\varepsilon)$ to be twice as large. In fact, we want something much stronger: that no matter how many times we halve $\varepsilon$, each halving adds only a constant amount of time. If you have some background (say, from reading about logarithms in the appendix), you know that such a relationship between $S$ and $\varepsilon$ can be expressed asymptotically by the following equation:
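To make the halving picture concrete, here is a toy sketch in Python. The constant `c` is an arbitrary illustrative choice, not a parameter of any real protocol: an approval time of the form $S = c\log(1/\varepsilon)$ grows by the same fixed amount every time $\varepsilon$ is halved.

```python
import math

def approval_time(eps, c=10.0):
    # Toy approval time obeying S = c * log(1/eps); c is an arbitrary
    # illustrative constant, not a real protocol parameter.
    return c * math.log(1 / eps)

# Each halving of eps adds exactly c * log(2) to the approval time,
# no matter how small eps already is.
for eps in [1e-2, 1e-2 / 2, 1e-2 / 4]:
    print(f"eps = {eps:.5f}  ->  S = {approval_time(eps):.3f}")
```

Doubling the confidence requirement forever only ever costs a fixed increment of waiting, which is exactly the behavior the asymptotic statement captures.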

$$S(\alpha,\varepsilon) = O\left(\log(1/\varepsilon)\right)$$

This motivates the following:

Definition? The protocol is safe if for any $\alpha < \frac{1}{2}$ it holds that $S(\alpha,\varepsilon) = O\left(\log(1/\varepsilon)\right)$.

The problem with this definition is that, as it turns out, it is impossible to satisfy.

And yet, DAGKnight does satisfy this very definition. How is that possible?

Being parameterless, it can adjust to varying network conditions and thus protect against any $\alpha$-adversary as long as $\alpha<\frac{1}{2}$ (although confirmation times shoot through the roof exponentially fast as $\alpha$ approaches $\frac{1}{2}$).

When we say no protocol can satisfy this property, this only holds for protocols that can be analyzed in our model. DAGKnight cannot be, because we assume a fixed latency bound, while DAGKnight, in some sense, responds to varying delays, even when they grow longer than our assumed $D$. If we tried to analyze DAGKnight in this model, we would have to cap off its adjustment ability (thus actually analyzing a protocol weaker than DAGKnight), and the "unhandled cases" would slightly decrease the size of attackers we can handle.

Before we understand how to correct the definition, we discuss a magical world where it can be satisfied, and is actually satisfied by Bitcoin.

Imagine a World With No Orphans

Consider how Bitcoin works under the following unrealistic assumption: the honest network never creates orphan blocks. That is, all honest blocks that were ever created, or ever will be, are arranged in a chain.

Hence, even when attacked, the network looks something like this:

where $tx$ is the transaction the adversary tries to double-spend and $tx'$ is a conflicting transaction.

We expect the honest network to create $1-\alpha$ blocks for every $\alpha$ blocks created by the attacker. In other words, the honest network creates blocks $\frac{1-\alpha}{\alpha}$ times faster. No matter what $\alpha$ is, as long as it is smaller than $\frac{1}{2}$, the gap between the honest and adversarial chains will increase. And, as we reason later, the probability that the adversary reverts the chain decreases exponentially with this gap.
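The exponential decay is reasoned about later in the text, but as a preview, the classic gambler's-ruin calculation from the Bitcoin whitepaper already shows its shape. The sketch below assumes the idealized no-orphans race described above:

```python
def catch_up_probability(alpha, gap):
    # Probability that an alpha-attacker, currently `gap` blocks behind,
    # ever catches up with the honest chain (gambler's-ruin bound).
    # Each extra block of deficit multiplies the probability by
    # alpha / (1 - alpha) < 1, so it decays exponentially in the gap.
    assert 0 <= alpha < 0.5
    return (alpha / (1 - alpha)) ** gap

for gap in [1, 2, 5, 10]:
    p = catch_up_probability(0.4, gap)
    print(f"gap = {gap:2d}  ->  revert probability = {p:.6f}")
```

Even a strong $40\%$ attacker sees their chances shrink geometrically with every block of deficit, which is why waiting for more confirmations buys exponentially more confidence.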

Orphans and the Scaling Problem

In reality, orphans do exist in the honest network. However, recall that we assumed the adversary can arrange their blocks in a perfect chain. This means that the attacker can revert the chain with less than half the hashing power. Say that the honest network orphans one in three honest blocks, and that $\alpha = 40\%$. We find that the honest network creates $60\%$ of the blocks, but a third of them go to waste, so only two thirds of that, namely $40\%$ of all the blocks, are honest chain blocks. This means that although $\alpha < \frac{1}{2}$, the adversary and the honest network are neck and neck. If we slightly increase the adversary's fraction to $41\%$, the attack will almost certainly succeed.

More generally, say that a fraction $r$ of the honest blocks is orphaned. For every $\alpha$ blocks created by the adversary, the honest network creates $1-\alpha$ blocks, but only $(1-r)(1-\alpha)$ of them will be honest chain blocks:

Hence, the network has no hope against an $\alpha$-adversary unless $(1-r)(1-\alpha) > \alpha$. Note that the no-orphans case described above is exactly the case $r=0$, where we recover the usual inequality $1-\alpha > \alpha$, or $\alpha < \frac{1}{2}$.

If we solve the more general inequality for $\alpha$, we get the following condition:

$$\alpha < \frac{1}{2}\left(1-\frac{r}{2-r}\right)$$

You can check that the right-hand side is $\frac{1}{2}$ for $r=0$, and goes to $0$ as $r$ approaches $1$.
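A few sample values make the shape of this curve clear enough. The sketch below just evaluates the condition we derived, using the algebraically equivalent form $(1-r)/(2-r)$:

```python
def alpha_threshold(r):
    # Largest attacker fraction the honest network can still beat when a
    # fraction r of honest blocks is orphaned. Algebraically equivalent
    # to (1/2) * (1 - r / (2 - r)).
    return (1 - r) / (2 - r)

for r in [0.0, 1 / 150, 2 / 15, 0.5, 0.9]:
    print(f"r = {r:.4f}  ->  need alpha < {alpha_threshold(r):.4f}")
```

The threshold falls monotonically from $\frac{1}{2}$ at $r=0$ toward $0$ as $r$ approaches $1$, exactly as the formula predicts.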

Strictly speaking, we conclude that Bitcoin could never satisfy the safety definition we proposed above. The orphan rate is always positive. The Bitcoin orphan rate is estimated to be at most one in $150$, so if we plug $r=1/150$ into our formula, we get that Bitcoin is "only" safe against adversaries with at most about $49.8\%$ of the hashrate.

This almost makes it seem as though there are no repercussions to this entire orphan thing, and we can just ignore it. But that's wrong. Bitcoin's very low throughput is the price Satoshi had to pay to obtain such a small orphan rate.

More generally, $r$ is mostly determined by two quantities: the block delay $\lambda$, chosen by the protocol designer, and the network latency $D$, which is also somewhat determined by the protocol designer. "Huh?" you ask, how can the protocol designer affect the network latency? By choosing the block size! Larger blocks take longer to traverse the network. This is not because they contain more data, which actually has a negligible effect, but because the data needs to be verified by each node, and the verification time typically grows linearly with the size of the block (verifying $4000$ transaction inputs will take about four times longer than verifying $1000$ transaction inputs).

To ensure a low orphan rate, Satoshi suggested parameterizing Bitcoin such that $\lambda \gg D$. It is this requirement that protects us from high orphan rates degrading security. It is also this requirement that is more commonly known as "the Bitcoin scaling problem".

The Math of Orphans*

It is common to wonder whether a ten-minute block delay is an exaggeration. Won't we be fine if we decrease it to, say, $30$ seconds?

To better understand the consequences, we will quickly analyze the relationship between $\lambda$, $r$, and $D$.

A necessary (and almost sufficient) condition for an orphan to be created is that more than one block is created within a period of $D$. As explained in the appendix, the number of blocks created in such a period is distributed as $Poi(D/\lambda)$. Using the approximation $e^{-x}\approx 1-x$, we can compute that

$$\begin{aligned}\mathbb{P}\left[Poi\left(\frac{D}{\lambda}\right)\ge2\right] & =1-\mathbb{P}\left[Poi\left(\frac{D}{\lambda}\right)<2\right]\\& =1-\mathbb{P}\left[Poi\left(\frac{D}{\lambda}\right)=0\right]-\mathbb{P}\left[Poi\left(\frac{D}{\lambda}\right)=1\right]\\& =1-\frac{\left(D/\lambda\right)^{0}\cdot e^{-D/\lambda}}{0!}-\frac{\left(D/\lambda\right)^{1}\cdot e^{-D/\lambda}}{1!}\\& =1-e^{-D/\lambda}-\frac{D}{\lambda}\cdot e^{-D/\lambda}\\& =1-e^{-D/\lambda}\left(1+\frac{D}{\lambda}\right)\\& \approx1-\left(1-\frac{D}{\lambda}\right)\left(1+\frac{D}{\lambda}\right)\\& =\left(D/\lambda\right)^{2}\end{aligned}$$

So we expect at least $(D/\lambda)^2$ orphans per network delay. And since there are $\lambda/D$ network delays in a block delay, we expect at least $D/\lambda$ orphans per block delay.
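To see how the $(D/\lambda)^2$ approximation behaves, one can compare it against the exact Poisson tail from the derivation above (a quick sketch; `d_over_lambda` stands for the ratio $D/\lambda$):

```python
import math

def orphan_event_prob_exact(d_over_lambda):
    # P[Poi(D/lambda) >= 2]: exact probability that two or more blocks
    # are created within one network delay (the orphan-triggering event).
    x = d_over_lambda
    return 1 - math.exp(-x) * (1 + x)

def orphan_event_prob_approx(d_over_lambda):
    # The (D/lambda)^2 value obtained via the crude e^{-x} ~ 1 - x step.
    return d_over_lambda ** 2

for x in [0.01, 0.05, 0.2]:
    print(f"D/lambda = {x:.2f}: exact = {orphan_event_prob_exact(x):.6f}, "
          f"approx = {orphan_event_prob_approx(x):.6f}")
```

Note that the first-order approximation overshoots: a second-order expansion of $e^{-x}(1+x)$ shows the exact tail behaves like $(D/\lambda)^2/2$ for small $D/\lambda$. Either way, both vanish quadratically in $D/\lambda$, which is the scaling the argument needs.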

The approximation $e^{-x}\approx 1-x$ is only accurate for very small values of $x$, that is, under the assumption that $\lambda \gg D$. But we can clearly see that in this regime the orphan rate is inversely proportional to the block delay. That is, making the block delay twice as short doubles the rate of orphans, and so on. In particular, decreasing the Bitcoin block delay to, say, $30$ seconds will cause at least $20$ times as many orphans. Now instead of $r=1/150$ we have $r=2/15$, and plugging it into the equation we get that the threshold for double-spend attacks is reduced to $\alpha \approx 46.4\%$.

That might not seem like much, but when we analyze confirmation times we will see this has a very tangible bearing on how long it takes to confirm a transaction.

Also, keep in mind that we only computed a lower bound on the number of orphans, and a very rough one at that. We only considered cases where a single block is orphaned, and did not take into account scenarios where longer chains of two, three, or more blocks are orphaned. These events become increasingly likely as orphan rates increase, making non-negligible higher-order contributions to the orphan rate.

Digression: Orphans and Difficulty

It is a common misconception that "orphan rates decrease gains for miners". This misunderstanding follows from the reasonable thought that if we have to throw blocks in the garbage, then whoever mined those blocks takes a loss.

However, that is not quite the case. The difficulty adjustment algorithm only controls the issuance rate of non-orphan blocks. There is no other way: since difficulty has to be in consensus, it can only depend on blocks that are on the chain.

In particular, if we have, say, an orphan rate of 20%, then mining blocks will be 20% easier, increasing the block rate to compensate. Yes, one fifth of the blocks you make will go to the trash, but you will make 1.25 times as many blocks, giving you the same rate of non-orphan blocks.
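The arithmetic of that compensation can be spelled out in a few lines (a toy calculation, using the 20% figure from the example above):

```python
orphan_rate = 0.20

# Difficulty targets the rate of *non-orphan* blocks only, so with 20%
# of blocks orphaned, mining becomes easier until total block production
# rises by a factor of 1 / (1 - orphan_rate) = 1.25.
total_production_multiplier = 1 / (1 - orphan_rate)

# A miner's rate of non-orphan blocks is then unchanged: the 1.25x extra
# production exactly cancels the 20% of blocks lost to orphans.
non_orphan_multiplier = total_production_multiplier * (1 - orphan_rate)

print(f"total blocks mined: x{total_production_multiplier:.2f}")
print(f"non-orphan blocks:  x{non_orphan_multiplier:.2f}")
```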

You might be tempted to think that a 20% orphan rate implies only 80% of the hash rate you see on the blockchain goes towards non-orphan blocks. So if the difficulty requires one terahash per second to see a block every ten minutes, then an adversary would only need 800 gigahash to 51%-attack the network. Again, this is not the case: the hash rate read off the difficulty only takes non-orphan blocks into account. This (combined with the fact that nodes do not broadcast blocks they see as orphans) makes measuring orphan rates in practice extremely difficult (especially if we consider that old orphans can be forged for cheap).

That being said, there are adverse consequences to high orphan rates (besides the wasted work): they introduce noise that increases confirmation times, and they increase the advantage of miners with better internet connections.

The Correct Definition

So if we want any hope of a security definition that is actually feasible, we need to accept that our assumption of a fixed network delay comes at the price of having to choose some $\delta > 0$, and then set the parameters such that the network is secure assuming $\frac{1}{2} + \delta$ of the miners are honest.

One might be tempted to use the expression we derived for $\alpha$ in the definition, but that would be a mistake. This computation was specific to Bitcoin. Our definition needs to be general. In fact, it shouldn't use the word "orphans" at all, as that would be too presumptuous. Orphans are just one way things can go wrong, but we need to be ready for anything.

The property of orphans that interests us is this: as $D/\lambda$ goes to zero, the orphan rate also goes to zero, and with it (though we have yet to prove it!) the required $\delta$. This is the magic property that we want.

This leads us to the following:

Definition: A blockchain protocol is $\delta$-safe if for any $\alpha < \frac{1}{2} - \delta$ it holds that $S(\alpha,\varepsilon) = O(\log(1/\varepsilon))$. A protocol is safe if it is $O(D/\lambda)$-safe.

For those not used to asymptotic notation, it might seem "complicated" to write something like "$O(D/\lambda)$-safe", but it is actually a great example of how asymptotic notation helps us conveniently focus on what matters to us: that as $D/\lambda$ goes to $0$, the value of $\delta$ required for the protocol to be $\delta$-safe also goes to $0$. The relationship between $D/\lambda$ and $\delta$ could be arbitrarily complicated, but we don't mind. We only care that they vanish together.

For a bit more big O evangelism, let us rewrite this definition without using asymptotic notation:

A blockchain is $\delta$-safe if for any $\alpha < \frac{1}{2} - \delta$ there is some constant $C > 0$ such that $e^{C\cdot S(\alpha,\varepsilon)} \le 1/\varepsilon$ for every $\varepsilon > 0$. It is safe if for every $\delta$ we can choose $D$ and $\lambda$ such that it is $\delta$-safe.

I think this mess makes a compelling case that even though asymptotic notation takes some getting used to, it allows us to neatly and elegantly discard irrelevant information and succinctly state the information that remains.
