Part I: Bitcoin
How Eyal and Sirer ruined Bitcoin forever (but not really)
In their paper, Eyal and Sirer present a strategy that increases the profit of a large miner by withholding blocks.
The idea is that by withholding the blocks they create and only releasing them at strategic times, a large miner can increase the orphan rate of the rest of the network more than their own, causing their proportion of non-orphaned work to increase and thereby earning a bigger cut of the reward than their contribution to mining warrants.
Bitcoin and Selfishness
For those who want the details of the Eyal-Sirer attack, as well as an overview of the subsequent attacks and countermeasures, I recommend the relevant section of my book.
The upshot of the attack, in concrete numbers, is summarized in this figure, lifted from their paper:

Here, $\gamma$ is the tie-winning probability. Imagine that Alice withholds a block until she hears about a competing block from the honest network, and then releases her withheld block. We let $\gamma$ be the probability that the network prefers Alice's withheld block over the honest block. It is a measure of how "well connected" Alice is.
We see that even if we assume zero connectivity ($\gamma = 0$), an attacker with more than 33% of the hashrate can profitably carry out the attack, while 42% suffices to create a majority of the non-orphaned blocks.
With perfect connectivity ($\gamma = 1$), 33% is already enough to create a majority of the non-orphaned blocks, and any perfectly connected attacker, however small, has something to gain from a selfish mining attack.
The most realistic scenario is that the attacker's connectivity parameter is somewhere in between, say halfway ($\gamma = 1/2$). In this case, 25% is enough for a profitable attack, while 38% is enough to create a majority of the non-orphaned blocks.
A common misconception is that because a 38% attacker can create a majority of the chain blocks, she can double-spend. It makes sense that by making more blocks than the honest network you can create a competing chain, but that's not what happens in selfish mining. The fraction of blocks you create remains the same; you just constantly headbutt honest blocks to kick them off the chain. You can only do this by pointing at, or near, the heaviest tip, which is impossible if you spend your efforts mining a side chain.
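If you want to poke at these numbers yourself, here is a minimal Monte-Carlo sketch of the Eyal-Sirer block race. This is my own toy implementation, not code from their paper: it assumes the simplified round-by-round model where each block is found by the attacker with probability equal to her hashrate fraction, tracks the attacker's private lead, and counts which blocks end up non-orphaned.

```python
import random

def selfish_mining_share(alpha, gamma, rounds=1_000_000, seed=0):
    """Estimate the attacker's share of non-orphaned (chain) blocks under the
    Eyal-Sirer selfish mining strategy.

    alpha: attacker's fraction of the global hashrate
    gamma: fraction of honest hashrate that mines on the attacker's block in a tie
    """
    rng = random.Random(seed)
    attacker_chain = 0  # attacker blocks that end up in the main chain
    honest_chain = 0    # honest blocks that end up in the main chain
    lead = 0            # length of the attacker's withheld private chain
    tie = False         # True when two length-1 branches compete

    for _ in range(rounds):
        attacker_found = rng.random() < alpha
        if tie:
            if attacker_found:
                attacker_chain += 2        # attacker extends her branch and wins both blocks
            elif rng.random() < gamma:
                attacker_chain += 1        # honest block lands on the attacker's branch
                honest_chain += 1
            else:
                honest_chain += 2          # the honest branch wins the race
            tie, lead = False, 0
        elif attacker_found:
            lead += 1                      # withhold the new block
        elif lead == 0:
            honest_chain += 1              # plain honest block, nothing to race against
        elif lead == 1:
            tie = True                     # publish the withheld block, a tie begins
        elif lead == 2:
            attacker_chain += 2            # publish both blocks, orphaning the honest one
            lead = 0
        else:
            attacker_chain += 1            # lead > 2: one more attacker block is locked in
            lead -= 1

    return attacker_chain / (attacker_chain + honest_chain)

# Compare the attacker's share of chain blocks to her hashrate alpha:
for alpha, gamma in [(0.25, 0.5), (0.34, 0.0), (0.40, 0.5), (0.42, 0.0)]:
    print(alpha, gamma, round(selfish_mining_share(alpha, gamma), 3))
```

Above the relevant thresholds, the printed share meets or exceeds the attacker's hashrate fraction, which is exactly the effect the figure describes.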
A successful double-spend attack looks like this:

whereas a successful selfish mining attack looks like this:

Note that in the latter example, the red blocks constitute the majority of chain blocks, but the blue blocks are still the majority of all blocks. That is, judging from the chain alone it seems like the red miner has a majority of the hashrate, but the orphans reveal that this is not quite the case. In this scenario, the attacker receives a majority of the rewards despite not mining a majority of the blocks, which is exactly the crux of selfish mining attacks.
Chain Quality (and Lack Thereof)
A key nuance in the discussion above is that while we don't like selfish mining, it does not enable a double-spend attack. In other words, it is a behavior we consider adversarial, but one that is not captured by the colloquial definition of what makes a blockchain secure.
It is natural to define a security notion that captures selfish mining attacks. That is exactly what Garay, Kiayias, and Leonardos did when they described the Bitcoin backbone protocol, a version of Bitcoin that is simplified enough for mathematical analysis, but more accurate than previously used models.
In that paper, they introduce the chain quality property. The precise definition is a bit technical, but the gist is this:
Definition (reproduced from Definition 4 in GKL, simplified, informal): A blockchain has ideal chain-quality if the amount of non-orphaned blocks produced by a miner with a fraction $\alpha$ of the global hashrate gets closer to a fraction $\alpha$ of the blocks as time passes.
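Stated slightly more symbolically (my paraphrase, not GKL's exact formulation): for a miner controlling a fraction $\alpha$ of the global hashrate,

$$
\frac{\#\{\text{non-orphaned blocks produced by that miner among the last } n\}}{n} \;\longrightarrow\; \alpha \quad \text{as } n \to \infty .
$$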
The key hidden detail here is how long we need to wait before the fraction becomes sufficiently close to $\alpha$. GKL did not answer this question for a straightforward reason: Bitcoin does not satisfy the ideal chain-quality property.
Selfish Generals?
In my experience, what I am about to explain in the following section tends to contradict how many people view Bitcoin's security properties.
The following statements are facts:
Deterministic finality cannot handle a situation where more than a third of the nodes are faulty.
In particular, BFT protocols cannot guarantee that they reach an agreement if close to half of the nodes allow themselves to deviate from the protocol.
In Bitcoin, we only need (slightly more than) half of the miners to be honest to ensure consensus is reached in a timely manner and becomes irreversible fast.
It is natural to assume that in this case, we can also use Bitcoin for binary voting: just let each miner post a "yes" or "no" on their blocks and then do a majority count.
Selfish mining proves that this is not the case: a sufficiently connected 34% attacker can create a majority of the blocks and win the vote.
It turns out that while Bitcoin can indeed handle a higher fault threshold, the property that a collusion of more than a third of the voters can always create a majority of the blocks persists. In particular, in any protocol, including probabilistic ones, a perfectly connected adversary with more than a third of the global hashrate can always produce a majority of the blocks.
So why are we even doing this?
Because the assumption of a perfectly connected adversary is unrealistic. A perfectly connected adversary always wins ties. She hears of all blocks extremely fast (no matter who created the block, she always hears about it before more than half of the network), and can transmit blocks to almost anyone.
It's hard to imagine such an adversary in reality. Even for a very well-connected adversary, there must be some other miner sufficiently distant that, by the time the attacker learns of their blocks, most of the honest network has seen them too, leaving the attacker's competing block no chance to win the tie.
Furthermore, the ability to selfishly mine becomes quite limited quite fast as $\gamma$ becomes smaller. This is already apparent in the Eyal-Sirer graph above. However, we see that for low (even zero) values of $\gamma$, a sufficiently large less-than-half adversary can still accrue disproportionate gains.
Fixing this is the problem that inspired the protocols we are about to discuss.
Unselfishing Bitcoin?
The aftermath of the discovery of selfish mining led to various approaches for mitigating its effects. As usual, the Bitcoin community was averse to modifying the underlying protocol. Some suggested changes to peer-to-peer policies (such as block relaying and tie-breaking rules) that, while not prohibiting the attack, would make it harder to carry out. Others argued that selfish mining is not a significant concern, given the extreme measures required to carry it out profitably.
On the theoretical front — where the motivation is not the preservation of Bitcoin, but understanding what probabilistic consensus can or cannot do — people work on developing protocols where this problem does not exist.
Statistical Convergence Rates
We laid out our goal: to make reward shares more fair.
However, we must recall that altering the reward distribution method can also affect properties we often take for granted. One crucial aspect is how long it takes the reward share to converge to a fair distribution. Say that a protocol assures that if we track the earnings of an $\alpha$ miner for long enough, they will approach a fraction $\alpha$ of the total emissions. How long is "long enough"?
Even if the protocol assures that this eventually happens, it could be that as we require more exact approximations, the waiting times shoot through the roof, even when the precision we expect is completely reasonable. For example, we will soon see an example where obtaining reasonable confidence that the error is small can take days.
Preserving chain quality is a qualitative statement that poses no requirements on the convergence rate (except that it can be computed from the precision and confidence we expect). In practice, this is not enough.
This is not just about patience. Tracking reward distribution is crucial for the everyday operations of pools and exchanges. In a while, we will introduce shares, a way for a miner to prove to a pool that they have done part of the work. A pool typically pays out for shares in advance, based on an estimate of the actual reward. An incorrect estimate can lead to the pool bleeding money. A protocol that forces pools to either take monetary risks or withhold payouts for days or weeks is impractical.
The State of the Art
A holy grail solution to selfish mining is a protocol that assures rapidly converging chain quality even if we only assume that a majority of miners is rational.
We are not quite there yet.
The first successful attempt at aligning proof-of-work incentives is the FruitChains protocol, introduced in 2016 by Pass and Shi. FruitChains manages to find a slightly weak, but still very satisfying, compromise between rationality and honesty. The chain quality property holds, in the sense that an honest miner will earn as much as they should as long as a majority of miners is honest.
Wait what? Isn't this precisely what we don't want to assume?
Well, it's subtle: in Bitcoin, a large miner can deviate and immediately benefit, even if all other miners are honest. In FruitChains, this is no longer the case. Deviating does not penalize you, but as long as no other miner deviates in the same way, it does not gain you anything either. This is a highly non-trivial property.
However, FruitChains has a drawback that makes it unusable: the time it takes to converge increases with the security guarantees we want.
Let us make this statement a bit more quantifiable: say we want the protocol to guarantee that if we count shares over $n$ blocks, there is a probability of at least $1 - \varepsilon$ that the rewards will not deviate by more than $\delta$. What should we set the number of shares per block, $k$, to be? (If these notations are unclear to you, hang tight, I introduce them in more detail in the next part).
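To get a feel for the arithmetic, here is a back-of-the-envelope estimate (mine, not a bound from either paper) that treats share production as independent coin flips and ignores adversarial behavior entirely: by a standard Chernoff bound, a miner expected to produce $\mu = \alpha k n$ shares deviates from that expectation by a relative error of more than $\delta$ with probability at most $2e^{-\delta^2 \mu / 3}$, which gives a required $k$ of roughly $3\ln(2/\varepsilon)/(\alpha \delta^2 n)$.

```python
from math import ceil, log

def shares_per_block(alpha, delta, epsilon, n_blocks):
    """Idealized estimate (i.i.d. shares, no adversary, Chernoff bound):
    shares per block needed so that a miner with hashrate fraction `alpha`
    deviates from its expected share count by a relative error of more than
    `delta` with probability at most `epsilon`, over `n_blocks` blocks."""
    return ceil(3 * log(2 / epsilon) / (alpha * delta**2 * n_blocks))

# Example: a 10% miner, 5% precision, 2^-32 failure probability,
# over one day of 10-minute blocks:
print(shares_per_block(alpha=0.10, delta=0.05, epsilon=2**-32, n_blocks=144))
```

The actual analyses are harder precisely because shares are not independent coin flips once an adversary gets to choose what to withhold and when.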
In PRS, for whatever values of $n$, $\delta$, and $\varepsilon$ you choose, there is a corresponding $k$. This is not true for FruitChains: there, it turns out that we can only find such a $k$ when $n$ is large enough. Moreover, the required $n$ also grows with the maximal assumed attacker fraction $\alpha$. Writing out the actual dependence is complicated, but a necessary condition takes the form of a lower bound on $n$.
This inequality is very loose, but even the rough estimate it provides establishes the point.
Say that we want bits of security that the distribution is fair up to an error of . We get from the inequality above that this will require at least blocks, regardless of how many shares per block we produce.
These are very lax assumptions. More realistic security expectations are and . Here we find that must be at least . (And some would argue that is still too lax).
Even with these rough estimates, we see that the convergence runs into days. Tighter analyses show that it runs into weeks and months for very reasonable parameter choices.
This remained the state of the art for a while. Literature on selfish-mining-resistant protocols continued to be published, primarily focusing on no-go theorems that demonstrate the vulnerability of various natural approaches to modifying FruitChains. We will mention some of these in part two. A few protocols, such as Bobtail and Prism, tried radically different approaches that weren't adopted due to drawbacks we will not get into.
As far as I know, the only recent approach (at least within the backbone framework) to emerge following FruitChains is Proportional Reward Splitting (PRS), authored by a collaboration of the Quai team and Dionysis Zindros (already familiar to me as a coauthor of NiPoPoWs and of Mining in Log-Space, the latter famously used in Kaspa's pruning protocol).
The PRS protocol is heavily inspired by FruitChains, and of the protocols just mentioned it is easily the most similar to it in structure and assumptions. But PRS takes just enough liberties to stand on its own feet, and it introduces two major improvements.
First, it manages to remove the honesty assumption, replacing it with rationality. Note that this is still not the holy grail: even in PRS, a rational majority alone is not sufficient to ensure chain quality. What is guaranteed is that if all miners are rational and none deviates, then a single miner with less than half of the hashrate gains nothing by deviating. The only way to earn from deviating is if there is a majority of deviators. But, again, deviation is not penalized either.
Second, and crucially, the security argument for PRS does not limit the range of the shares-per-block parameter $k$. In particular, it can be parametrized with as few or as many shares as you want. More shares still provide better convergence, but only because it takes less time to collect the same number of shares.
But in our world there are no solutions, only trade-offs, and what PRS gains in speed and game-theoretic leniency, it pays back in precision. While FruitChains can, in theory, provide selfish mining resilience against any attacker with less than half of the hashrate, PRS can only provide such security (though with softer assumptions) for miners with at most of the total hashrate.
For comparison, in Bitcoin, for the Eyal-Sirer strategy has a threshold of . However, this attack is not optimal. A paper by Sapirshtein, Sompolinsky, and Zohar isolates the optimal strategies and proves that, for , a selfish miner with anywhere above of the global hashrate can mine profitably.
We can summarize this into the following table:
| | FruitChains | PRS |
| --- | --- | --- |
| Model assumptions | Honest majority | Rational majority |
| Shares per block required to obtain fairness with confidence $\varepsilon$ within $n$ blocks | Lower bound on $n$ | None |
These advances arguably place PRS as the best way yet to make a chain resistant to selfish mining, but the Quai team is not satisfied. As we will see, the rational strategy is still not the best we can do. Among the rational strategies, there is a unique one that, if all miners follow, maximizes the performance of the protocol. The next goal is to modify the protocol such that all other rational strategies vanish. We will conclude the series with a discussion of this problem and current attempts at a solution.
Marco? Polo!
Before describing FruitChains and PRS, it's worth considering a similar problem: designing a mining pool.
When you think about it, mining pools have to solve a similar problem: how to divide the block reward among the pool users, such that each user is rewarded proportionally to their contribution.
The setting of mining pools is very different from selfish mining. Since the pool's decision does not affect consensus, the decision doesn't have to be inferable from the chain. Mining pools and users transmit off-chain data in the decision-making process, and only post the final decision to the chain. This is a trade-off: on the one hand, we need this additional information to make a fair decision. On the other hand, it makes the decision process vulnerable to censorship, data manipulation, and other problems that blockchains solve. Mitigating the latter drawback requires sophisticated protocols such as PPS+, PPLNS, FPPS, and others.
FruitChains and PRS essentially take this core idea and reverse it: post enough of this additional information on-chain, but find a way to do so that is hard to manipulate. The first step to understanding these protocols is to understand what "additional information" mining pools collect and how it is helpful (while conveniently ignoring the problem of aggregating this information trustlessly).
So, how do mining pools work?
I had a Witty Title for this Subsection, but it was Too Corny
I like to think of miners in a pool as a search excavation looking for a treasure lost in a vast field of corn. And when I say vast, I mean vast, with billions upon trillions of corn stalks.
To keep things fair, they decide that the treasure will be divided such that each member is rewarded in proportion to how much ground they covered.
The million Quai question is of course: how can they determine how much ground each member covered?
Say that a party member called Seamus found the treasure. Based on this fact alone, it is impossible to distribute the treasure fairly. To see why that is, consider three scenarios: First, all search members worked equally hard, and Seamus lucked upon the treasure. Second, Seamus did all the work, while the rest of the party were... partying, I guess? Third, Seamus was lazing on the beach while everyone else was hard at work, but when he got peckish and went to get a cob of corn, he just happened to stumble into a treasure chest. While all three scenarios should furnish wildly different distributions, in all cases we have the same information: Seamus found the treasure.
So, we need additional information, but what information? What could the treasure hunter provide that would prove how much work they did?
One idea is to ask all excavators to collect cobs of corn as they go, but trying this, we unsurprisingly find that a vast corn field has too much corn. Excavators will rapidly find themselves hauling huge crates of corn, and counting them at the end becomes prohibitively tedious. The analogy, if it wasn't clear, is to ask each miner to provide the entire list of nonces they hashed and the resulting values, proving they indeed computed as many hashes. To fathom how unreasonable this is, recall that a single Antminer produces about 200 trillion hashes per second.
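As a quick sanity check of just how much data that would be, here is the arithmetic, assuming (my numbers, purely for illustration) about 200 TH/s per machine and roughly 40 bytes to record each attempted nonce together with its hash:

```python
HASHES_PER_SECOND = 200e12   # one machine at roughly 200 TH/s (illustrative)
BYTES_PER_RECORD = 40        # e.g. an 8-byte nonce plus a 32-byte hash (assumption)
BLOCK_TIME_SECONDS = 600     # ten-minute blocks

bytes_per_block = HASHES_PER_SECOND * BYTES_PER_RECORD * BLOCK_TIME_SECONDS
print(f"{bytes_per_block / 1e18:.1f} exabytes per block, from a single machine")
```

That is several exabytes per block from a single machine, which is obviously a non-starter.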
Question: You could also argue that collecting all nonces and hashes is actually much worse, as validating a nonce requires computing the hash, so arguably, validating all the nonces doesn't just need tons of space, but also the same amount of work that went into actually mining the block. Is this a good argument?
We see that only providing the winning nonce is not enough, and providing all of them is way too much. Is there a middle ground?
Shares are for Sharing
Consider this idea: Before the search starts, the excavation manager flies over the search area in his helicopter, randomly scattering plastic tokens across the vast corn fields. The number of tokens is tiny compared to the billions upon trillions of stalks of corn, but still large enough to count. Say, ten million of them.
Now, each member of the search excavation can collect these tokens as they encounter them, giving a rough approximation of how much ground they covered. When the treasure is finally recovered, each member will get a share proportional to the number of tokens they found. For example, if Alice, Bobette, and Charline found 3, 5, and 7 tokens, then Bobette will get a fraction of $\frac{5}{3+5+7} = \frac{5}{15}$, or one-third, of the treasure. Note that we divide by the number of tokens actually found, not by the number of tokens scattered.
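As a tiny sketch of this payout rule (the names and numbers are just the example above):

```python
def split_treasure(tokens_found, treasure):
    """Divide the treasure in proportion to the tokens each member found.
    We divide by the number of tokens actually found, not by the number
    of tokens the manager scattered."""
    total = sum(tokens_found.values())
    return {name: treasure * count / total for name, count in tokens_found.items()}

print(split_treasure({"Alice": 3, "Bobette": 5, "Charline": 7}, treasure=1.0))
# {'Alice': 0.2, 'Bobette': 0.333..., 'Charline': 0.466...}
```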

To translate this intuition to blockchains, recall that in proof-of-work, attempting to solve the block means changing it a little bit (in particular, a field called the nonce, designated for that purpose) and hashing it. The hashing operation returns, for each attempt, a uniformly random number between $0$ and $N$ for some huge $N$ (typically, $N = 2^{256}$). Solving the block means finding a nonce that hashes below some difficulty target $T$.
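In code, the proof-of-work loop described above looks roughly like this. It is a toy sketch: a single SHA-256 over a made-up header, whereas Bitcoin actually hashes a specific header format twice.

```python
import hashlib

def mine(header: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces until sha256(header || nonce), read as an integer,
    falls below the difficulty target."""
    for nonce in range(max_nonce):
        digest = hashlib.sha256(header + nonce.to_bytes(8, "little")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
    return None

# A deliberately easy target so the demo finishes instantly:
# roughly one in 2**16 attempts succeeds.
T = 2**(256 - 16)
print("found nonce:", mine(b"example block header", T))
```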
In our search excavation analogy, the members are miners, the ground they cover is the nonces they tried, and the treasure is a nonce that hashes to a number smaller than the difficulty target. But what are the plastic tokens?
A ubiquitous way to capture them is in terms of something called a subblock.
Say that we are looking for a nonce that hashes below $T$; how many nonces that hash below $2T$ do we expect to find along the way?
Well, since the hash value is uniformly random, hashing below $2T$ is twice as likely. So we expect to find two such nonces along the way. Similarly, we expect to find three hashes below $3T$ and, more generally, $k$ hashes below $k \cdot T$.
We usually like $k$ to be a power of two, so we assume $k = 2^d$. If the difficulty target is $T$, we call a block with a nonce that causes it to hash below $2^d T$ a $d$-subblock.
Question: What do you think is the reason we use $d$-subblocks to refer to blocks that hash below $2^d T$ and not ?
Say we set $d = 10$; then finding a $10$-subblock is $2^{10} = 1024$ times easier than finding a block. In other words, we expect about a thousand $10$-subblocks to be found in the process of looking for the block. Moreover, the number of subblocks each miner finds should be proportional to the fraction of the work they did, just like an excavator who covers twice as large an area will find roughly twice as many plastic tokens (and the approximation rapidly becomes more accurate as we increase the number of tokens).
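Here is a toy simulation (my own sketch, with made-up parameters) of three miners whose hashrates are 3, 5, and 7, where every hash attempt is modeled as a uniform draw and a draw below $2^d T$ counts as a $d$-subblock. The fraction of subblocks each miner collects should land close to their fraction of the hashrate:

```python
import random

def simulate_share_fractions(hashrates, d=10, attempts_per_unit=50_000, seed=0):
    """Count d-subblocks ('shares') per miner when work is proportional to hashrate."""
    rng = random.Random(seed)
    block_target = 1e-6                      # chance an attempt solves the block (illustrative)
    share_target = (2 ** d) * block_target   # chance an attempt is at least a d-subblock
    shares = []
    for h in hashrates:
        attempts = h * attempts_per_unit     # attempts made are proportional to hashrate
        shares.append(sum(rng.random() < share_target for _ in range(attempts)))
    total = sum(shares)
    return [round(s / total, 3) for s in shares]

print(simulate_share_fractions([3, 5, 7]))   # roughly [0.2, 0.333, 0.467]
```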
Exercise: The analogy I made is not entirely accurate. Can you spot the lie? How would you modify the algorithm such that it matches the definition?
Subblocks/shares indeed seem to be a handy tool for probing the distribution of hashrate among miners, but applying them requires more work. For protocols that want to use them on-chain, we need to find a way to include them that will not degrade the security of the network and will not be vulnerable to the same selfish mining attacks that exist in Bitcoin.