Understanding BlockDAGs and GHOSTDAG
Part 1: BlockChains and BlockDAGs · Chapter 2: the Block Chain Paradigm

A Graph Theory Primer



You have probably noticed that most of the mathematical exposition has been deferred to an appendix. However, basic graph theory will be the exception to that rule.

Graphs are important to understand even for readers who only seek a cursory, informal understanding. They aren't just a theoretical instrument; they underwrite all of the visual language used to illustrate and demonstrate pretty much every idea throughout this book. Understanding what they represent can be extremely valuable even to the layperson. Besides, they are easy and fun, so why not?

Graphs

A (directed) graph is nothing but a bunch of circles, called nodes or vertices, and a bunch of arrows between them, called edges. For example:

[figure: a whacky graph]

We are usually interested in less whacky graphs, so we assume there are no multiple arrows between the same two vertices, and no loopy arrows going from a vertex to itself. For example:

[figure: a simple graph]

This is called a simple (directed) graph.
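To make this concrete, one common way to represent a simple directed graph in code is an adjacency map from each vertex to the set of vertices its arrows point at. The vertex names below are purely illustrative, not taken from the book; this is a minimal Python sketch.

```python
# A simple directed graph as an adjacency map: each vertex maps to the
# set of vertices its outgoing arrows point at.  Using a set (rather
# than a list) automatically rules out multiple arrows between the
# same two vertices.
graph = {
    "A": {"B", "C"},   # arrows A -> B and A -> C
    "B": {"C"},        # arrow  B -> C
    "C": set(),        # C has no outgoing arrows
}

def is_simple(g):
    """Check the 'no loopy arrows' condition: no vertex points at itself."""
    return all(v not in targets for v, targets in g.items())
```

The "no multiple arrows" condition is enforced by the data structure itself, while the "no loopy arrows" condition has to be checked explicitly.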

In most textbooks, directed graphs are considered an intermediate topic, and the discussion starts from undirected graphs. These are graphs where for any two vertices A, B there is an arrow from A to B if and only if there is an arrow from B to A. In this case, the two arrows are compressed into a single, arrowless line. For example:

[figure: an undirected graph]

Unlike most textbooks, we will be concerned almost exclusively with directed graphs, so we will employ a backward convention: a graph is assumed to be directed unless explicitly stated otherwise.

Usually undirected graphs are used to describe reciprocal affinities: for example, the graph whose nodes are Facebook users, and there is an edge between any two people if and only if they are friends on Facebook, is undirected. That's because being friends on Facebook is subject to mutual agreement and is thus reciprocal. In contrast, Twitter allows users to follow other users without being followed back, and this lack of reciprocity is manifest in the directedness of the graph.

DAGs

We want to use graphs to describe causality or chronology. It makes sense that such a graph would be directed, because it does not make sense that an event A caused or preceded event B while event B caused or preceded event A.

However, how would you understand the chronology described by this graph:

[figure: a closed timelike curve]

Does it say that event A happened before event B, which happened before event C, which happened before event A?

Given a simple directed graph, we say that the vertex B is reachable from the vertex A if there is some way to get from A to B by only following arrows. For example:

[figure: A and A′ are both reachable from B; however, B is reachable from A but not from A′]

Each way to reach B from A is called a path. A path is just a list of arrows such that the first arrow leaves A, the last arrow enters B, and any arrow in between leaves the vertex entered by the previous arrow.

For example, in the graph above:

[figure: there is a single path from A to B (red), but three different paths from B to A′ (orange, blue, and green)]

Note that in our convention, a path must have at least one arrow; staying in place doesn't count. Under this convention, a cycle is a path from a vertex A to itself. Given the name, the following definition should not come as a surprise.

Definition: A directed acyclic graph (DAG) is a simple directed graph with no cycles.
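Reachability and the DAG condition translate directly into code. The following Python sketch (using a hypothetical adjacency-map representation, where each vertex maps to the set of vertices it points at) is for illustration only.

```python
def is_reachable(g, a, b):
    """Return True iff b is reachable from a by following at least one arrow."""
    stack, seen = list(g[a]), set()
    while stack:
        v = stack.pop()
        if v == b:
            return True
        if v not in seen:
            seen.add(v)
            stack.extend(g[v])
    return False

def has_cycle(g):
    """A cycle is a path (with at least one arrow) from some vertex to itself,
    so a graph is a DAG exactly when this returns False."""
    return any(is_reachable(g, v, v) for v in g)
```

Note how the "at least one arrow" convention is respected: the search starts from the vertices that A points at, not from A itself, so `is_reachable(g, a, a)` is True precisely when a cycle passes through a.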

The DAGs that will interest us have one more useful property: they are rooted. That is, there is a vertex, which we call G (alluding to the genesis block), with the property that it is reachable from all other vertices. We call this vertex the root of the DAG.

Note that I assumed above that a rooted DAG has only one root. It is easy to prove that this is indeed the case: if a DAG had two roots, each would be reachable from the other, creating a cycle.
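Finding the root, when one exists, can be done by brute force: test every vertex for reachability from all the others. A hypothetical, unoptimized Python sketch (again assuming an adjacency-map representation):

```python
def find_root(g):
    """Return the unique vertex reachable from all other vertices, or None."""
    def reachable(a, b):  # is b reachable from a by following arrows?
        stack, seen = list(g[a]), set()
        while stack:
            v = stack.pop()
            if v == b:
                return True
            if v not in seen:
                seen.add(v)
                stack.extend(g[v])
        return False

    roots = [r for r in g if all(reachable(v, r) for v in g if v != r)]
    # By the uniqueness argument in the text, a rooted DAG has exactly
    # one such vertex; two roots would imply a cycle.
    return roots[0] if len(roots) == 1 else None
```

This is quadratic in the worst case and purely illustrative; it is not how a real node would organize its blocks.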

Trees

In the context of block chains, we are interested in a particular type of rooted DAG called a tree:

Definition: A rooted DAG is called a tree if all vertices have at most one arrow leaving them.

One can see that the definition is less flexible than it seems. The properties of a rooted DAG actually imply that there are zero arrows leaving the root vertex, and exactly one arrow leaving any other vertex. A node that has no arrow going into it is called a tip or a leaf.
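The implied structure is easy to check mechanically. In the Python sketch below (vertex names and representation are my own, not the book's), `is_tree` encodes the consequence just described, and `tips` picks out the vertices with no incoming arrows.

```python
def is_tree(g, root):
    """Check the tree condition on a rooted DAG: zero arrows leave the
    root, and exactly one arrow leaves every other vertex."""
    if g[root]:
        return False
    return all(len(targets) == 1 for v, targets in g.items() if v != root)

def tips(g):
    """Tips (leaves): vertices that no arrow points at."""
    pointed_at = set().union(*g.values()) if g else set()
    return {v for v in g if v not in pointed_at}
```

For instance, in a tree where two blocks point at genesis and a third points at one of them, the tips are exactly the two blocks nothing points at.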

Given any node besides the root, we call the node it points at its parent. Given any node that is not a tip, we call the nodes pointing at it its children. Note that a vertex can have many children but only one parent. Hence, for any non-root vertex B we can define, without ambiguity, B.Parent as the unique vertex that B points to.

In pretty much any graph theory book you open, you will find an opposite convention. Trees are either undirected, or directed with the arrows pointing away from the root, making the parents further from the root than the children. This is not because computer scientists think that trees grow from foliage to root, but because trees are often used to describe many possible routes that all converge to the same conclusion, and from this vantage the root is the "end of time" and not the "beginning of time". However, in this book we assume that the root is the oldest block, encouraging an inverse convention.

Note that in a rooted tree there is only one path from any (non-root) vertex to the root (in the exercises you will show the converse: a rooted DAG in which there is only one path from any non-root vertex to the root is a tree). We call this path the chain of B, and denote it B.Chain.
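Since every non-root vertex has a unique parent, a tree can be stored as parent pointers, and the chain of B is computed by simply walking those pointers until the root. A hypothetical Python sketch (block names are illustrative; the chain is returned as the sequence of vertices the path visits):

```python
# A tree as parent pointers: each non-root block maps to the unique
# block it points at.  G is the root and therefore has no entry.
parent = {"B1": "G", "B2": "B1", "B3": "B2"}

def chain(b):
    """Return b's chain: the unique path from b down to the root,
    given as the list of vertices it visits."""
    path = [b]
    while path[-1] in parent:        # the root is the only vertex with no parent
        path.append(parent[path[-1]])
    return path
```

For example, `chain("B3")` walks B3 → B2 → B1 → G.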

If we are being completely pedantic, the existence of closed timelike curves is still very much an open question. Nevertheless, we quite reasonably assume that the nodes running our blockchain will never traverse such a loop, and the definitions above make this assumption precise. The terms introduced in this section will accompany us for the rest of the book.
