How PoW Works*
It the basis of PoW we have a hash function. You know, that proverbial "complex mathematical puzzle" that miners are trying to solve? Well, first of all, it is not a "complex mathematical puzzle" in any way, it is actually the dumbest possible puzzle, a puzzle that can only be solved by trying all possible solutions one by one until we hit one that works. Grated, it is very complex to create such a dumb puzzle, but the puzzle is nonetheless very dumb, a magic 8-ball you keep shaking until you get the desired output. And that's a good thing. You do not want miners to get clever and find all sorts of shortcuts that will undermine your security. The entire point of proof of work is that there are no ways to "clever away" the hardness of the problem.
Hashes as Random Oracles
As I said, it takes seriously smart people to design a sufficiently dumb "puzzle", so I will treat the problem like a cryptographer and assume someone already solved the problem for me. More formally, I will assume the existence of a random oracle: a magic random function that floats in the sky, accessible to anyone.
I survey random oracles to some extent in the appendix, but all you should know about it is that it transforms arbitrary strings (such as block headers) into strings of 256 bits, such that if you input a new string, you get a completely random output, but if you input the same string again, you will get the same output.
Random oracles obviously do not exist in reality, but we can create things that are pretty darn close. For example, hash functions try to do just that. We just model the hash function as a random oracle, and ignore the implementation details.
To simplify our notations, we call the has function and let denote the string it outputs on input . Further, we will not think of as a bit string, but just as an integer .
To make this work, we need to make an additional assumption on our hash function: that it makes exactly one query per seconds. However, I stress that this is an unrealistic assumption, and PoW cryptocurrencies have to compensate for that by employing difficulty adjustment. I explain how difficulty adjustment works in Bitcoin in the next section. But for now, we assume that is fixed and known.
The Dumb Puzzle
As a warmup, consider the following problem: I choose an arbitrary number and ask you to find any string such that . How would you go about trying? The only way you have to probe on different strings until you find one that works.
Now say you have two different strings and . The numbers and were sampled uniformly, which means that knowing tells you nothing about . Consequentially, if while trying to solve the problem you query the oracle on and find that this actually tells you very little. It tells you that doesn't solve the problem, and that's it.
Each such attempt has a probability of one in to succeed, so it will take (on average) seconds to find a solution. If is a trillionth of a second, this would take just over three billion trillion trillion trillion trillion years.
Now, what if we modify the puzzle like this: I choose two arbitrary numbers and and ask you to find such that either or ? Well, now the success probability is twice as large, so the time it takes to solve will be half has long, coming up at .
Instead of choosing some arbitrary and we could choose and . So our puzzle becomes: find such that . (similarly, in the previous puzzle we could choose , so the puzzle becomes to find such that .)
We can generalize this by choosing some and setting the puzzle to find some such that . The time expected time it would take to solve now becomes . If we want to make the average block delay seconds long, we can achieve this by choosing such that . We call this the difficulty target.
Solving this for we find the simple relation . For example, if we assume that (that is, that the global hashrate is one petahash), and want a ten minutes block delay, you should set
One peculiarity with the difficulty target is that lower targets make for harder blocks. This makes sense, as hitting a number below one million is harder than hitting a number below one trillion.
So does this solve all our problems? Not really. If the puzzle is just to find some such that then a miner could just find this number once, and use it as "proof" for all eternity.
The solution is to change the puzzle such that the string depends on the block. That's why we say that the miner is mining a block. The idea is this: we can separate a block into two parts, header and data. The header of the block contains a few fields of information that depend on the data therein, and an important field called nonce that can be any number. Why is this field there? Because that's exactly what the miner changes in order to get a sufficiently low hash.
Let be a block header without the nonce set, and let be that same block with the nonce set to , When a miner mines for the block , what they actually do is compute for different values of until they find a value for which .
So are we done? Not quite. The remaining question is, if we only hash the header, what does this say about the data? Can't we just change the data, and use the header as proof? The "obvious" solution is to hash the entire block instead of just the header, but that will make mining much harder and more centralized. Instead, Satoshi had the foresight to use a cool construction called Merkle trees. For our current purposes you don't have to understand exactly what a Merkle tree is, only that it allows us to take an arbitrary amount of information and extract from it a 256 bit string called the Merkle root such that it is impossible to manipulate the data without changing the Merkle root. By including the Merkle root in the block header, we get that the string affirms the veracity of the block contents as well.
Difficulty Adjustment in Bitcoin
To understand difficulty adjustment, it is instructive to understand how it s done in Bitcoin. Many argue that Bitcoin's approach to difficulty adjustment is outdated, has many drawbacks, and can even be dangerous, while others argue that it is time tested and its simplicity protects us from unexpected attacks. So essentially, a carbon copy of any other argument about Bitcoin.
Bitcoin's approach is to use a fixed window. That is, the difficulty is readjusted at fixed intervals. In Bitcoin, there is a difficulty readjustment once every blocks, which are supposed to represent two weeks.
The idea is to check how long it took to create these blocks, and compare it to the expected two weeks. But let us use denotations instead of concrete numbers.
Say that we set the length of an epoch to blocks, then we expect the length of the window to be seconds, where is the block delay. At the end of the epoch, we check how long it actually took the network to create these blocks. Say it took block delays, that is, seconds. Then we get that the ratio between the desired and actual block creation speed is . So to adjust the difficulty, we need to change the time it takes to mine a block by a factor of . That is, we change the difficulty from to .
Handling Timestamps
This is all good and well, but there is one problem: how do we know how long it took to create the blocks?
Well, each block includes a timestamp that allegedly reports when it was discovered, but it cannot be authenticated. It could be that the miner's clock is out of sync, or worse, that she is deliberately trying to interfere with the network by messing with the difficulty adjustment.
As a first line of defense, Bitcoin bounds the difficulty change to a factor of four. No matter what the timestamps look like, if the old and new targets are and , then we will always have that . But that's obviously not enough.
So how can we avoid timestamp manipulations? If we just take the first and last timestamp of the epoch and check their difference, and the malfeasant miner happens to create the first or last block, she essentially gets a carte blanche to set the difficulty to whatever she wants. The adjustment cannot depend just on two blocks.
One might be tempted to tackle this by analyzing the timestamps within the epoch better, and trying to isolate the "true" ones. Indeed, one can improve the robustness using such approaches, but any such mechanism would be exploitable if it can be fed arbitrary timestamps.
Instead, the solution is to impose smart limitations on the block's timestamp when it is processed.
The idea is not to allow a timestamp to be more than two hours off the mark. If we can guarantee that, then the worse an attacker can do is to compress/stretch the difficulty window by four hours, which are about of two weeks. But how can we guarantee it?
The first observation is that we can verify that a timestamp is not too deep into the past from the consensus data. Since a block delay is ten minutes, we expect blocks to be created in two hours. So what should we do? Should we go 12 blocks back from the candidate block, and check that it has an earlier timestamp? That happens to be exploitable. If the miner happens to have created that block too, and the block 12 blocks before that one, and the blocks 12 blocks before that one, and so on. Each block in this chain allows two more hours of manipulation. Bitcoin overcomes this by taking the timestamps of the last blocks, and picking the median. Highly irregular timestamps will not be the median, but in the fringes.
So say we want to enforce a timestamp deviation of at most block delays. Say that the block has a timestamp , and let be the median timestamp of the blocks preceding it, the we have the following rule
Rule I: if then the block is invalid.
OK, but what about blocks whose timestamp deviates into the future? Now we don't have any blocks to rely on. We can't have the validity of a block depend on the blocks succeeding it!
The idea is to use the actual system clock. Let be the system clock while validating the block. The length of time is an approximation of actual block delays (that is, according to the current, possibly inaccurate difficulty target). We would like to invalidate blocks whose timestamps are more than later than , and it is easy to check this just means that . The problem is that , unlike , is not in consensus, and can vary from miner to miner. The solution is to delay the block:
Rule II: if , wait until before including the block, if while waiting a new valid block arrived, prefer it over the delayed block.
Note that miners are incentivized to follow this validation rule, as it gives them more time to create a competing block.
This does not prohibit blocks with timestamps set to the far future, but it strongly discourages them. The later the timestamp is, the more miners will delay mining over it, increasing the probability that the block will be orphaned.
With these two rules in place, we can finally enforce the very simple policy: at the end of an epoch, find the earliest and latest time stamps in the epoch, set , and adjust the difficulty as explained in the previous section. The timestamp deviation rules we outlined strongly limit the ability to abuse this mechanism through timestamp manipulation.
One might be curious why we "look for the earliest and latest timestamps" and not just take the first and last one. The answer is that timestamps in Bitcoin are not always monotone. In fact, there are know examples of later blocks with earlier timestamps. Amusingly enough, I found that out from a 2014 paper that quotes a paper from 2015.
Last updated