Part I: Bitcoin

How Eyal and Sirer ruined Bitcoin forever (but not really)

In their paper, Eyal and Sirer present a strategy that increases the profit of a large miner by withholding blocks.

The idea is that by withholding the blocks they create, only releasing them at strategic times, a large miner can increase the orphan rates of the rest of the network more than their own orphan rate, causing their proportion of non-orphaned work to increase, thereby earning a bigger cut of the reward than their contribution to mining warrants.

Bitcoin and Selfishness

For those who want the details of the Eyal-Sirer attack, as well as an overview of the subsequent attacks and countermeasures, I recommend the relevant section of my book.

The upshot of the attack, in concrete numbers, is summarized in this figure lifted from their paper:

(Fig. 2 from Eyal-Sirer. The horizontal axis is the selfish miner's hashrate fraction $\alpha$; the vertical axis is the expected gain $\alpha(1+\delta)$.)

$\gamma$ is the tie-winning probability. Imagine that Alice withholds a block until she hears about a competing block from the honest network, and then she releases her withheld block. We let $\gamma$ be the probability that the network prefers Alice's withheld block over the honest block. It is a measure of how "well connected" Alice is.

We see that even if we assume zero connectivity, an attacker with more than 33% can profitably carry out the attack, while as little as 42% suffices to create a majority of the non-orphaned blocks.

With perfect connectivity, 33% is already enough to create a majority of the non-orphaned blocks, and any perfectly connected attacker has something to gain from a selfish mining attack.

The most realistic scenario is that the attacker's connectivity parameter is somewhere in between, say halfway ($\gamma = 1/2$). In this case, we see that 25% is enough for a profitable attack, while 38% is enough to create a majority of non-orphaned blocks.

A common misconception is that because a 38% attacker can create a majority of the blocks, she can double-spend. It might seem that by making more blocks than the honest network you could create a competing chain, but that's not what you do in selfish mining. The total fraction of blocks remains the same, but you constantly headbutt honest blocks to kick them off the chain. You can only do this by pointing at, or near, the heaviest tip, which is impossible if you spend your efforts mining a side chain.

A successful double-spend attack looks like this:

whereas a successful selfish mining attack looks like this:

Note that in the latter example, the red blocks constitute the majority of chain blocks, but the blue blocks are still the majority of all blocks. That is, judging from the chain alone, it seems like the red miner has a majority of the hashrate, but the orphans reveal that this is not quite the case. In this scenario, the attacker receives a majority of the rewards despite not mining a majority of the blocks, which is exactly the crux of selfish mining attacks.

Chain Quality (and Lack Thereof)

A key nuance in the discussion above is that while we don't like selfish mining, it does not enable a double-spend attack. In other words, it is a behavior that is considered adversarial, but it is not captured by the colloquial definition of what makes a blockchain secure.

It is natural to define a security notion that captures selfish mining attacks. That is exactly what Garay, Kiayias, and Leonardos did when they described the Bitcoin backbone protocol, a version of Bitcoin that is simplified enough for mathematical analysis, but more accurate than previously used models.

In that paper, they introduce the chain quality property. The precise definition is a bit technical, but the gist is this:

Definition (reproduced from Definition 4 in GKL, simplified, informal): A blockchain has ideal chain-quality if the fraction of non-orphaned blocks produced by a miner with a fraction $\alpha$ of the global hashrate gets closer to $\alpha$ as time passes.

The key hidden detail here is how long we need to wait before the fraction becomes sufficiently close to $\alpha$. GKL did not answer this question for a straightforward reason: Bitcoin does not satisfy the ideal chain-quality property.

Is there "non-ideal" chain quality?

Yes!

The GKL analytical framework (the so-called "Bitcoin backbone protocol") makes two assumptions: semi-synchrony (the network delay is bounded) and honest majority.

In this framework, the "real" property we're after (in a notation slightly more approachable than GKL's) is $\ell$-chain $\alpha$-quality, where $\alpha$ is a number between zero and one representing the hashrate fraction of an arbitrary, possibly selfish, miner, and $\ell$ is a positive integer designating how many blocks we look at when we check for fairness.

In other words, we can define $e_{\ell,\alpha}$ to be the expected proportion of blocks out of the next $\ell$ blocks that were created by the $\alpha$-miner, and we define $\alpha$-chain-quality to be the case where $e_{\ell,\alpha}$ approaches the constant $\alpha$ (that is, the variance vanishes) as $\ell$ increases (GKL's definition is actually stronger, and guarantees a deterministic upper bound, but this version is accurate enough for our purposes).
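To make the quantifiers explicit, here is one way to write this in symbols. This is my own rendering of the informal statement above, not GKL's formal definition, and $X_{\ell,\alpha}$ is notation I introduce just for this display:

$$
X_{\ell,\alpha} = \frac{\#\{\text{blocks created by the } \alpha\text{-miner among the next } \ell \text{ chain blocks}\}}{\ell},
\qquad
e_{\ell,\alpha} = \mathbb{E}\left[X_{\ell,\alpha}\right],
$$

and $\alpha$-chain-quality asks that $e_{\ell,\alpha} \to \alpha$ (with the variance of $X_{\ell,\alpha}$ vanishing) as $\ell \to \infty$.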

Having ideal chain quality is the same as having $\alpha$-chain-quality for any $\alpha<1/2$.

GKL note that Bitcoin does have perfect honest chain-quality. That is, in case everyone follows the protocol, the reward is distributed fairly. But this is a trivial statement that we know already. The analysis gets interesting when the $\alpha$-miner is arbitrary, and in particular may be adversarial.

Eyal and Sirer's attack proves that Bitcoin does not have ideal chain quality. Even if we assume a very poorly connected miner ($\gamma = 0$), Eyal-Sirer tells us we can hope for at most $1/3$-chain-quality, and for a perfectly connected adversary ($\gamma=1$) there's no $\alpha$-chain-quality for any $\alpha>0$. GKL extend the analysis to provide more accurate lower bounds on a worst-case attacker.

Note that chain quality does not quantify how profitable selfish mining is (for that there's the $\delta$-fairness property we will discuss a bit later), nor does it care about how quickly we obtain fairness, that is, how large $\ell$ needs to be so that observing $\ell$ blocks accurately estimates $\alpha$. This will also come up a bit later.

(Note that $\ell$ can depend on $\alpha$, and indeed it must be the case that $\ell$ increases as $\alpha$ approaches $1/2$, even for a fixed confidence.)

Selfish Generals?

In my experience, what I am about to explain in the following section tends to contradict how many people view Bitcoin's security properties.

The following statements are facts:

  1. BFT protocols cannot guarantee that they reach an agreement if close to half of the nodes allow themselves to deviate from the protocol (in fact, this guarantee is already lost once a third of the nodes may deviate)

  2. In Bitcoin, we only need (slightly more than) half of the miners to be honest to ensure consensus is reached in a timely manner and becomes irreversible fast

It is natural to assume that in this case, we can also use Bitcoin for binary voting: just let each miner post a "yes" or "no" on their blocks and then do a majority count.

Selfish mining proves that this is not the case: a sufficiently connected 34% attacker can create a majority of the blocks and win the vote.

As it turns out, while it is true that Bitcoin can handle a higher fault threshold, the property that a collusion of more than a third of the voters can always create a majority of the blocks persists. In particular, for any protocol, including probabilistic ones, a perfectly connected adversary with more than a third of the global hashrate can always produce a majority of the blocks.

Really? Any protocol?

OK, not exactly. There are protocols that manage to provide a bound higher than one-third even for an attacker who wins all ties. The best-known (perhaps only) example is the Bobtail protocol and its derivatives.

But there's obviously a caveat: Bobtail changes the rules for block discovery. This is a very deep consensus change that excludes Bobtail from the so-called "backbone protocol", the most standard framework for analyzing blockchain protocols (now don't get me wrong, there is a lot of meaningful analysis outside the relatively limited backbone protocol framework, but in most cases, the protocol itself can be described in this framework).

Consequently, the techniques and general theorems proved within the backbone framework no longer apply to it, forcing a proper analysis to either provide complicated reductions or generalize these theorems to more refined models. So, for example, the Bobtail paper falls short of providing a rigorous analysis or even a formal model, and relies on evidence such as security proofs in limited threat models and empirical data.

So why are we even doing this?

Because the assumption of a perfectly connected adversary is unrealistic. A perfectly connected adversary always wins ties. She hears of all blocks extremely fast (no matter who created the block, she always hears about it before more than half of the network), and can transmit blocks to almost anyone.

It's hard to imagine such an adversary in reality. Even for a very well-connected adversary, there must be some other miner sufficiently distant that, by the time the attacker learns of their blocks, most honest miners also have, giving the attacker's block a zero chance to win.

Furthermore, the ability to selfishly mine becomes quite limited quite fast as $\gamma$ becomes smaller. This is already apparent in the Eyal-Sirer graph above. However, we see that for low (even zero) values of $\gamma$, a sufficiently large less-than-half adversary can still accrue disproportionate gains.

Fixing this is the problem that inspired the protocols we are about to discuss.

Unselfishing Bitcoin?

The aftermath of the discovery of selfish mining led to various approaches for mitigating its effects. As usual, the Bitcoin community was averse to modifying the underlying protocol. Some suggested changes to peer-to-peer policies (such as block relaying and tie-breaking rules) that, while not prohibiting the attack, would make it harder to carry out. Others argued that selfish mining is not a significant concern, given the extreme measures required to carry it out profitably.

On the theoretical front — where the motivation is not the preservation of Bitcoin, but understanding what probabilistic consensus can or cannot do — people have been working on developing protocols where this problem does not exist.

Statistical Convergence Rates

We laid out our goal: to make reward shares more fair.

However, we must recall that altering the reward distribution method can also affect properties we often take for granted. One crucial aspect is how long it takes the reward share to converge to a fair distribution. Say that a protocol assures that if we track the earnings of an $\alpha<0.5$ miner for long enough, they will approach a fraction $\alpha$ of the total emissions. How long is "long enough"?

Even if the protocol assures that it eventually happens, it could be that as we require more exact approximations, the waiting times shoot through the roof, even if the precision we expect is completely within reason. For example, we will soon see an example where obtaining reasonable confidence that the error is less than $1\%$ can take days.

Preserving chain quality is a qualitative statement that poses no requirements on the convergence rate (except that it can be computed from the precision and confidence we expect). In practice, this is not enough.

This is not just about patience. Tracking reward distribution is crucial for the everyday operations of pools and exchanges. In a while, we will introduce shares, a way for a miner to prove to a pool that they have done part of the work. A pool typically pays out for the share in advance, based on an estimation of the actual reward. An incorrect estimation can lead to a pool bleeding money. A protocol that forces pools to either take monetary risks or withhold payouts for days or weeks is impractical.

The State of the Art

A holy grail solution to selfish mining is a protocol that assures rapidly converging chain quality even if we only assume that a majority of miners is rational.

We are not quite there yet.

The first successful attempt at aligning proof-of-work incentives is the FruitChains protocol, introduced in 2016 by Pass and Shi. FruitChains manages to find a slightly weak, but still very satisfying, compromise between rationality and honesty. The chain quality property holds, in the sense that an honest miner will earn as much as they should as long as a majority of miners are honest.

Wait what? Isn't this precisely what we don't want to assume?

Well, it's subtle: in Bitcoin, a large miner can deviate and immediately benefit, even if all other miners are honest. In FruitChains, this is no longer the case. Deviation does not penalize you, but as long as no other miner deviates in the same way, it does not gain you anything either. This is a highly non-trivial property.

However, FruitChains has a drawback that makes it unusable: the time it takes to converge increases with the security guarantees we want.

Let us make this statement a bit more quantifiable: say we want the protocol to guarantee that if we count shares for $k$ blocks, there is a probability of at most $2^{-\kappa}$ that the rewards deviate by more than $\delta > 0$. What should we set the number of shares per block, $\lambda$, to be? (If these notations are unclear to you, hang tight, I introduce them in more detail in the next part.)

In PRS, for whatever values of $\kappa,\delta,k$ you choose, there is a corresponding $\lambda$. This is not true for FruitChains. In FruitChains, it turns out that we can only find $\lambda$ when $k$ is large enough. Moreover, $k$ also grows with the maximal assumed attacker fraction $\alpha$. Writing the actual dependence is complicated, but a necessary condition is that $k\ge 2\cdot\left(\kappa + \log_2(\kappa/\delta^2)\right)$.

This inequality is very loose, but even the rough estimate it provides establishes the point.

Say that we want $50$ bits of security that the distribution is fair up to an error of $10\%$. We get from the inequality above that this will require at least $126$ blocks, regardless of how many shares per block we produce.

These are very lax assumptions. More realistic security expectations are $\delta = 0.05$ and $\kappa=64$. Here we find that $k$ must be at least $160$. (And some would argue that $\delta=0.05$ is still too lax.)
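If you want to reproduce these numbers, here is a minimal Python sketch that evaluates the loose necessary condition quoted above. The function name is mine, the base-2 logarithm matches the table below, and small differences from the figures quoted above come from rounding and from the looseness of the bound itself:

```python
from math import ceil, log2

def fruitchains_min_blocks(kappa: float, delta: float) -> int:
    """Loose lower bound on the window size k in FruitChains, per the
    necessary condition k >= 2 * (kappa + log2(kappa / delta**2))
    quoted above. Purely illustrative."""
    return ceil(2 * (kappa + log2(kappa / delta ** 2)))

print(fruitchains_min_blocks(kappa=50, delta=0.10))  # ~125 blocks
print(fruitchains_min_blocks(kappa=64, delta=0.05))  # ~158 blocks
```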

Even with these rough estimates, we see that the convergence runs into days. Tighter analyses show that it runs into weeks or months for very reasonable parameter choices.

This remained the state-of-the-art for a while. Literature on selfish-mining-resistant protocols continued to be published, primarily focusing on no-go theorems that demonstrate the vulnerability of various natural approaches to modifying FruitChains. We will mention some of these in part two. A few protocols, such as Bobtail and Prism, tried radically different approaches that weren't adopted due to drawbacks we will not get into.

As far as I know, the only recent approach (at least within the backbone framework) to emerge after FruitChains is Proportional Reward Splitting (PRS), authored by a collaboration of the Quai team and Dionysis Zindros (whom I already know as a coauthor of NiPoPoWs and Mining in Log-Space, famously used in Kaspa's pruning protocol).

The PRS protocol is heavily inspired by FruitChains. It is easily the most similar to FruitChains in terms of structure and assumptions. But PRS takes just enough liberties to stand on its own feet and introduces two major improvements.

First, it manages to remove the honesty assumption, replacing it with rationality. Note that this is still not the holy grail: even in PRS, a rational majority alone is not sufficient to ensure chain quality. What is guaranteed is that if all miners are rational and none deviates, then a single miner with less than half of the hashrate gains nothing by deviating. The only way to earn from deviating is if there is a majority of deviators. But again, deviation is not penalized either.

Second, and crucially, the security argument for PRS does not limit the range of $k$. In particular, it can be parametrized with as few or as many shares as you want. More shares still provide better convergence, but only because it would take less time to collect the same number of shares.

But in our world, there are no solutions, only trade-offs, and what PRS gains in speed and game-theoretic leniency, it pays back in precision. While FruitChains can, in theory, provide selfish mining resilience against any $\gamma=0$ attacker with less than half of the hashrate, PRS can only provide such security (though with softer assumptions) for miners with at most $\sim 40\%$ of the total hashrate.

For comparison, in Bitcoin, for $\gamma=0$ the Eyal-Sirer strategy has a threshold of $33\%$. However, this attack is not optimal. A paper by Sapirshtein, Sompolinsky, and Zohar isolates the optimal strategies and proves that, for $\gamma=0$, a selfish miner with anywhere above $25\%$ of the global hashrate can mine profitably.

We can summarize this into the following table:

|  | Bitcoin | FruitChains | PRS |
| --- | --- | --- | --- |
| Selfish mining threshold | $25\%$ ($\gamma=0$) (Sapirshtein et al.) | $50\%$ | $38\%$-$42\%$ |
| Model assumptions | Honest majority | Honest majority | Rational majority |
| Shares per block $\lambda$ required to obtain fairness $\delta$ with confidence $2^{-\kappa}$ within $k$ blocks | N/A | $\lambda = \Omega\left(\frac{\kappa}{k\delta^2}\right)$ | $\lambda = \Omega\left(\frac{\kappa}{k\delta^2}\right)$ |
| Lower bound on $k$ | N/A | $k\ge 2\cdot\left(\kappa + \log_2\left(\frac{\kappa}{k\delta^2}\right)\right)$ | None |
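To get a feel for the $\lambda$ row, here is a back-of-the-envelope evaluation of $\kappa/(k\delta^2)$ in Python. The $\Omega(\cdot)$ hides a constant, so treat the output as an order of magnitude only, and the parameter values are just examples:

```python
def shares_per_block(kappa: float, delta: float, k: int) -> float:
    """Order-of-magnitude estimate of the shares-per-block parameter lambda,
    ignoring the constant hidden in the Omega(...) from the table."""
    return kappa / (k * delta ** 2)

# 64 bits of confidence, 5% fairness error, measured over a window of 100 blocks:
print(shares_per_block(kappa=64, delta=0.05, k=100))  # 256.0 shares per block
```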

These advances arguably place PRS as the best way yet to make a chain resistant to selfish mining, but the Quai team is not satisfied. As we will see, rational strategies are still not the best we can do: among them, there is a unique one that, if all miners follow it, maximizes the performance of the protocol. The next goal is to modify the protocol such that all other rational strategies vanish. We will conclude the series with a discussion of this problem and current attempts at a solution.

Marco? Polo!

Before describing FruitChains and PRS, it's worth considering a similar problem: designing a mining pool.

When you think about it, mining pools have to solve a similar problem: how to divide the block reward among the pool users, such that each user is rewarded proportionally to their contribution.

The setting of mining pools is very different from selfish mining. Since the pool's decision does not affect consensus, the decision doesn't have to be inferable from the chain. Mining pools and users transmit off-chain data in the decision-making process, and only post the final decision to the chain. This is a trade-off: on the one hand, we need this additional information to make a fair decision. On the other hand, this makes the decision process vulnerable to censorship, data manipulation, and other problems that blockchains solve. Mitigating the latter drawback requires sophisticated schemes such as PPS+, PPLNS, FPPS, and others.

FruitChains and PRS essentially take this core idea and reverse it: let's post enough of this additional information on-chain, but find a way to do so that is hard to manipulate. The first step to understanding these protocols is to understand what "additional information" mining pools collect and how it is helpful (while conveniently ignoring the problem of aggregating the information trustlessly).

So, how do mining pools work?

I had a Witty Title for this Subsection, but it was Too Corny

I like to think of miners in a pool as a search excavation looking for a treasure lost in a vast field of corn. And when I say vast, I mean vast, with billions upon trillions of corn stalks.

To keep things fair, they decide that the treasure will be divided such that each member is rewarded in proportion to how much ground they covered.

The million Quai question is of course: how can they determine how much ground each member covered?

Say that a party member called Seamus found the treasure. Based on this fact alone, it is impossible to distribute the treasure fairly. To see why that is, consider three scenarios: First, all search members worked equally hard, and Seamus lucked upon the treasure. Second, Seamus did all the work, while the rest of the party were... partying, I guess? Third, Seamus was lazing on the beach while everyone else was hard at work, but when he got peckish and went to get a cob of corn, he just happened to stumble into a treasure chest. While all three scenarios should furnish wildly different distributions, in all cases we have the same information: Seamus found the treasure.

So, we need additional information, but what information? What could the treasure hunter provide that would prove how much work they did?

One idea is to ask all excavators to collect cobs of corn as they go, but trying this, we unsurprisingly find that a vast corn field has too much corn. Excavators will rapidly find themselves hauling huge crates of corn, and counting them at the end becomes prohibitively tedious. The analogy, if it wasn't clear, is to ask each miner to provide the entire list of nonces they hashed and the resulting values, proving they indeed computed that many hashes. To fathom how unreasonable this is, recall that a single Antminer produces about 200 trillion hashes per second.

Question: You could also argue that collecting all nonces and hashes is actually much worse, as validating a nonce requires computing the hash, so arguably, validating all the nonces doesn't just need tons of space, but also the same amount of work that went into actually mining the block. Is this a good argument?

Hint

Try to find a probabilistic method to provide a very good approximation of the work distribution by only looking at a fraction of the hashes.

Answer

No, it is not a good argument.

We could validate a heap of nonces by picking a few random nonces and only checking them. How accurate is this approach?

Say that we only check $k$ nonces. Also say that someone tries to cheat: they send us a lot of nonces, but only a fraction $\alpha<1$ of them are real. What are the chances that they escape our detection?

Assume that $k$ is much smaller than the number of hashes. For each nonce we check, the probability that it is not fake is $\alpha$, making the probability that all of them are not fake $\alpha^k$.

The key observation here is that the success probability of the fraud only depends on $\alpha$ and $k$, not on the actual number of nonces.

We can set $k$ rather high. The computational cost of our test increases linearly with $k$ and (since computing a single hash is very efficient) with a very small constant. So even setting $k$ to one million will not cause anyone to break a sweat.

But let's be modest and analyze $k=1000$. What is the probability that we miss an adversary who faked $0.1\%$ of their shares? This corresponds to $\alpha = 0.999$, so the probability is $0.999^{1000} \approx 37\%$.

Too much? By taking $k=10000$ this goes down to less than $0.005\%$, and adding another zero to $k$ makes the result so small that even my scientific calculator just evaluates it to zero.
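Here is a small Python sketch of the spot-check idea. The structure and names are mine, and the proof-of-work hash is simplified to a bare SHA-256 of the nonce; it also evaluates the escape probability $\alpha^k$ for the two cases discussed above:

```python
import hashlib
import random

def escape_probability(alpha: float, k: int) -> float:
    """Chance that a cheater whose list is only a fraction alpha genuine
    survives k independent random checks."""
    return alpha ** k

def spot_check(claims: list[tuple[bytes, bytes]], k: int) -> bool:
    """Recompute the hash for k randomly chosen (nonce, claimed_hash) pairs;
    reject as soon as one of them is forged."""
    for nonce, claimed_hash in random.sample(claims, min(k, len(claims))):
        if hashlib.sha256(nonce).digest() != claimed_hash:
            return False
    return True

print(escape_probability(0.999, 1_000))   # ~0.37
print(escape_probability(0.999, 10_000))  # ~4.5e-05, i.e. below 0.005%
```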

We see that only providing the winning nonce is not enough, and providing all of them is way too much. Is there a middle ground?

Shares are for Sharing

Consider this idea: Before the search starts, the excavation manager flies over the search area in his helicopter, randomly scattering plastic tokens across the vast corn fields. The number of tokens is tiny compared to the billions upon trillions of stalks of corn, but still large enough to count. Say, ten million of them.

Now, each member of the search excavation can collect these tokens as they encounter them, giving a rough approximation of how much ground they covered. When the treasure is finally recovered, each member will get a share proportional to the number of tokens they found. For example, if Alice, Bobette, and Charline found 3, 5, and 7 tokens, then Bobette will get a fraction of $5/(3+5+7)$, or one third, of the treasure. Note that we divide by the number of tokens found, not the number of tokens scattered.
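In code, the payout rule is just proportional division by the total number of tokens found; a minimal sketch (the names are mine):

```python
def split_treasure(tokens_found: dict[str, int], treasure: float) -> dict[str, float]:
    """Split the treasure proportionally to the number of tokens each member found.
    Note that we divide by the tokens found, not by the tokens scattered."""
    total = sum(tokens_found.values())
    return {member: treasure * count / total for member, count in tokens_found.items()}

print(split_treasure({"Alice": 3, "Bobette": 5, "Charline": 7}, treasure=1.0))
# {'Alice': 0.2, 'Bobette': 0.333..., 'Charline': 0.466...}
```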

To translate this intuition to blockchains, recall that in proof-of-work, attempting to solve the block means changing it a little bit (in particular, a field called the nonce, designated for that purpose) and hashing it. The hashing operation will return, for each attempt, a uniformly random number between $0$ and $N-1$ for some huge $N$ (typically, $N=2^{256}$). Solving the block means finding a nonce that hashes below some difficulty target $T$.

In our search excavation analogy, the members are miners, the ground they cover is the nonces they tried, and the treasure is a nonce that hashes to a number smaller than the difficulty target. But what are the plastic tokens?

A ubiquitous way to capture them is in terms of something called a subblock.

Say that we are looking for a nonce that hashes below $T$: how many nonces that hash below $2T$ do we expect to find along the way?

Well, since the hash value is uniformly random, hashing below $2T$ is twice as likely. So we expect to find two such nonces along the way. Similarly, we expect to find $3$ hashes below $3T$ and, more generally, $n$ hashes below $n\cdot T$.

We usually like $n$ to be a power of two, so we assume $n=2^k$. If the difficulty target is $T$, we call a block with a nonce that causes it to hash below $2^k\cdot T$ a $k$-subblock.
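In code, checking whether a nonce yields a block or a $k$-subblock is just two threshold comparisons on the same hash. Here is a minimal sketch, using SHA-256 and treating the block header as opaque bytes (all names are mine):

```python
import hashlib

def pow_hash(header: bytes, nonce: int) -> int:
    """Hash the header together with the nonce and read the digest as an integer in [0, 2^256)."""
    return int.from_bytes(hashlib.sha256(header + nonce.to_bytes(8, "big")).digest(), "big")

def is_block(header: bytes, nonce: int, target: int) -> bool:
    """A valid block hashes below the difficulty target T."""
    return pow_hash(header, nonce) < target

def is_k_subblock(header: bytes, nonce: int, target: int, k: int) -> bool:
    """A k-subblock only needs to hash below 2^k * T, so it is 2^k times easier to find."""
    return pow_hash(header, nonce) < (target << k)
```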

Question: What do you think is the reason we use $k$-subblocks to refer to blocks that hash below $2^k\cdot T$ and not $k\cdot T$?

Answer

Because it is often much more natural to think about how much heavier/lighter a block is than it should be in terms of how many leading zeros it has, and not the actual numerical value. A $k$-subblock may have up to $k$ fewer leading zeros than a full block.

In proof of work it is often more natural to talk in terms of entropy, which essentially counts bits. If $T$ happens to be a power of two, $T=2^m$, then saying that the output is "smaller than $T$" is saying that all bits in the output, except perhaps the least significant $m$ bits, are zero. We allow a $k$-subblock to be $2^k$ times larger, which means we soften the requirement to allow the least significant $m+k$ bits to be non-zero. That is, a $k$-subblock is required to have $k$ fewer leading zeros than a valid block.

Say we set $k=10$; then finding a $k$-subblock is $2^{10} \approx 1000$ times easier than finding a block. In other words, we expect about a thousand $k$-subblocks to be found in the process of looking for the block. Moreover, the number of subblocks each miner finds should be proportional to the fraction of the work they did, just like an excavator who covers twice as large an area will find roughly twice as many plastic chips (and the approximation rapidly becomes more accurate as we increase the number of chips).

Exercise: The analogy I made is not entirely accurate. Can you spot the lie? How would you modify the algorithm such that it matches the definition?

Hint

Think about how shares distribute with respect to blocks. Now, think about how plastic chips distribute with respect to treasures.

Answer

Note that in our definition, a block is always a subblock. We defined a $k$-subblock as having a hash below $2^k\cdot T$, and a block as having a hash below $T$. Clearly, satisfying the latter condition implies satisfying the former.

In our analogy, a "block" is a treasure, and a "subshare" is a chip. However, it is possible to find a treasure without finding a chip! Fixing the analogy means making sure there is a chip in every treasure chest. However, we actually prefer to go the other way, and fix the protocol to match the analogy. Why? Because generally speaking, unneeded dependencies should be avoided, and there's no good reason to insist on retaining a dependency between blocks and shares we can easily remove.

We can remove this by using what's known as the "two-for-one PoW trick". To check if a nonce gives a block, hash it and check whether the result is smaller than $T$. To check if a nonce gives a share, hash it, reverse the result (when written as a binary string), flip its bits, and check whether the result is smaller than $2^k\cdot T$. The idea is that when checking for one thing, we use the rightmost bits, and when checking for the other, we use the leftmost bits. That is, for one purpose we read the number left-to-right, and for the other we read it right-to-left.

If we assume $2 \log(T) + k \ll \log(N)$, we get that the dependency between being a block and being a share is overwhelmingly small. The details are left as a recommended exercise in elementary probability theory.
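Here is a sketch of the two-for-one check as described above, in the same style as the earlier snippet: the block test reads the hash as-is, while the share test reverses the bit string and flips its bits before comparing against the relaxed share target. This is my own rendition of the idea, not code taken from either paper:

```python
import hashlib

HASH_BITS = 256

def pow_hash(header: bytes, nonce: int) -> int:
    return int.from_bytes(hashlib.sha256(header + nonce.to_bytes(8, "big")).digest(), "big")

def reverse_and_flip(value: int, bits: int = HASH_BITS) -> int:
    """Read the hash as a bit string, reverse it, and flip every bit."""
    as_bits = format(value, f"0{bits}b")
    return int("".join("1" if b == "0" else "0" for b in reversed(as_bits)), 2)

def is_block(header: bytes, nonce: int, target: int) -> bool:
    return pow_hash(header, nonce) < target

def is_share(header: bytes, nonce: int, target: int, k: int) -> bool:
    """Shares read the hash 'from the other end', so that being a share is
    (nearly) independent of being a block."""
    return reverse_and_flip(pow_hash(header, nonce)) < (target << k)
```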

Subblocks/shares indeed seem to be a handy tool for probing the distribution of hashrate among miners. But applying them requires more work. For protocols that want to use them on-chain, we need to find a way to include them that will not degrade the security of the network and will not be vulnerable to the same selfish mining attacks that exist in Bitcoin.
