Asymptotics, Growth, and Decay

When considering things like the security and efficiency of algorithms and procedures, and in particular of cryptographic schemes and blockchains, we need a way to relate them to the various parameters. Consider for example a procedure that sorts an array: it is given a list of numbers, and returns them in increasing order. How can we describe "how fast" that procedure is? We can ask how long it takes to sort ten numbers, or twenty, but that won't tell us much if in the future we want to use it to sort 1357 or 20545 numbers. We need some way to talk about how the length of the computation grows.

For this we use a function $f$ that is given the input length $n$ and outputs "how long" it would take to sort an array of this size.
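As a rough illustration (a minimal sketch of my own, not part of the formal discussion), the following Python snippet times Python's built-in sort for a few input lengths. The exact numbers you get will depend entirely on your machine, which is precisely the issue the next paragraphs address.

```python
# A minimal sketch: time the built-in sort for a few input lengths n.
# The absolute numbers are machine-dependent; only the trend matters here.
import random
import time

def sort_time(n: int) -> float:
    """Return the wall-clock time (in seconds) to sort n random numbers."""
    data = [random.random() for _ in range(n)]
    start = time.perf_counter()
    sorted(data)
    return time.perf_counter() - start

for n in [1_000, 10_000, 100_000, 1_000_000]:
    print(f"n = {n:>9}: {sort_time(n):.4f} seconds")
```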

Now, the term "how long" is also ill defined. How do we count? In seconds? CPU cycles? Don't all such metrics depend on the computer running the code? Can't they be different for two different lists of the same length?

The answer to all of that is yes. The whole point of discussing growth is to put that aside, and just look at the trajectory. This is often called the asymptotics of the function.

It is true that the asymptotics doesn't have all the information, but it tells us what happens in the long run. That "other part", which doesn't describe the rate of growth but gives us concrete numbers (and is often just referred to as "the constants"), is also very important. But from some vantage points, like the one permeating most of this book, the asymptotic point of view is the appropriate one.

Asymptotic Notation

To streamline the notation, computer scientists came up with a somewhat confusing yet very powerful notational system that captures the "type of growth" of a function while ignoring everything else.

Say that $f$ is some function that interests us, like the runtime of a sorting algorithm given the length $n$ of the list it needs to sort. As we said, $f$ is quite complicated and not even completely defined. However, we can say something like: $f$ grows "at most linearly". What do we mean by that? That while $f$ might be a bit erratic and unexpected, we know that there is a function that looks like a line such that $f$ is almost always below it. This would look something like this:

Illustration of $f=O(n)$, the red line is the function $2n+6$

We use the notation $f=O(n)$ to say that there is some line such that $f$ is below it from some point onward. In other words, we say that $f=O(n)$ if there exist two numbers $m$ and $b$ such that for any sufficiently large $n$ we have that $f(n) < mn + b$. Note that the notation $f=O(n)$ hides the actual numbers $m,b$ and only tells us they exist. This is the power of asymptotic notation: it hides the constants and allows us to qualitatively talk about the growth trend. So yeah, the constants may change based on the exact architecture of the system that will eventually run the code, but for any system there will be some constants fitting the bill, and that is what is captured by this notation.

This is often called the big oh notation.
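To make the definition concrete, here is a small sketch using an example function of my own, $f(n) = 3n + 50\sqrt{n} + 100$: it is $O(n)$, and one possible choice of constants witnessing that is $m=4$, $b=0$.

```python
# Illustrative check: f(n) = 3n + 50*sqrt(n) + 100 is O(n), witnessed by
# the (arbitrary but valid) constants m = 4, b = 0.
import math

def f(n: int) -> float:
    return 3 * n + 50 * math.sqrt(n) + 100

m, b = 4, 0
# Find the first n where f dips below the line m*n + b; the inequality
# 50*sqrt(n) + 100 < n shows it then stays below for all larger n.
first_ok = next(n for n in range(1, 100_000) if f(n) < m * n + b)
print(f"f(n) < {m}*n + {b} from roughly n = {first_ok} onward")
```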

Exercise: Prove that $n^2 \ne O(n)$

Solution

Say that $n^2 = O(n)$; then there are some $m$, $b$, and $N$ such that for any $n>N$ we have that $n^2 < mn+b$. Dividing both sides by $n$ we get that $n < m + \frac{b}{n}$. However, for $n>b$ we have that $m+\frac{b}{n} < m+1$, but $m+1$ is a constant, so we can't have that $n < m+1$ for all sufficiently large $n$.
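The same argument can be "seen" numerically (an illustration only, not a substitute for the proof): whichever line $mn+b$ we try, $n^2$ eventually climbs above it and stays there.

```python
# Illustration with a few arbitrary lines: no choice of m and b keeps
# n^2 below m*n + b forever.
def crossover(m: float, b: float) -> int:
    """Smallest n with n**2 > m*n + b (such an n exists for any m and b)."""
    n = 1
    while n * n <= m * n + b:
        n += 1
    return n

for m, b in [(2, 6), (100, 1_000), (10_000, 10_000)]:
    print(f"n^2 exceeds {m}*n + {b} from n = {crossover(m, b)} onward")
```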

More generally, given two functions $f$ and $g$ we write $f=O(g)$ to mean that there are some constants $m$ and $b$ such that, for sufficiently large $n$, it holds that $f(n)\le m\cdot g(n) + b$. This allows us to write things like $f = O(n^2)$ to say: "look, I can't tell you exactly how $f$ grows, but the rate at which it grows is at most quadratic; if you put in an input three times longer, the runtime grows by at most (roughly) a factor of three squared, that is, nine".
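Here is a quick numerical illustration of that last remark, using a made-up function with quadratic growth: the ratio $f(3n)/f(n)$ approaches $3^2 = 9$ as $n$ grows.

```python
# Toy example: tripling the input of a quadratically growing function
# multiplies the output by (roughly) 3**2 = 9 once n is large.
def f(n: int) -> int:
    return 2 * n**2 + 30 * n

for n in [10, 100, 1_000, 10_000]:
    print(f"n = {n:>6}: f(3n) / f(n) = {f(3 * n) / f(n):.3f}")
```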

Exercise: Prove that $n^2 = O(n^3)$. More generally, prove that if $a$ and $b$ are natural numbers then $n^a = O(n^b)$ if and only if $a\le b$.

Just as the big oh notation helps us bound things from above, we have the big omega notation to bound things from below. By writing $f(n) = \Omega(n)$ we say that $f$ grows at least linearly, but it could actually grow faster, just like a sublinear function is still $O(n)$.

Exercise: Write a formal definition for $f=\Omega(g)$

Solution

There are some constants $m$ and $b$ such that, for sufficiently large $n$, it holds that $f(n)\ge m\cdot g(n) + b$.
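For instance (a small worked example of my own), plugging into this definition shows directly that $n^2 = \Omega(n)$:

```latex
% Take m = 1 and b = 0. Then for every n \ge 1,
n^2 \;=\; n \cdot n \;\ge\; 1 \cdot n \;=\; m \cdot n + b,
% so the required inequality holds for all sufficiently large n.
```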

Finally, we have the theta notation, which we use to say that we know exactly what the asymptotics are. In brief, we write $f = \theta(g)$ if it holds that both $f = O(g)$ and $f = \Omega(g)$. The notation $f=\theta(g)$ essentially means "we might not know what $f$ is, but as long as we talk about growth rate, we can just pretend that it is $g$". Typically, $g$ will be much simpler than $f$. For example, $f$ could be the running time of some complex algorithm, or the probability of reverting a transaction in some concrete protocol, and instead of computing $f$ explicitly we could just prove that, say, $f = \theta(n^2)$.
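As a toy example of my own, $f(n) = 3n^2 + 5n$ is $\theta(n^2)$: numerically, the ratio $f(n)/n^2$ stays squeezed between two positive constants (it tends to $3$), which is exactly the "both $O$ and $\Omega$" situation.

```python
# Toy example: f(n) = 3n^2 + 5n is theta(n^2). The ratio f(n) / n^2 stays
# between two positive constants, so for large n the function f is bounded
# above and below by constant multiples of n^2.
def f(n: int) -> int:
    return 3 * n**2 + 5 * n

for n in [10, 100, 1_000, 10_000]:
    print(f"n = {n:>6}: f(n) / n^2 = {f(n) / n**2:.4f}")
```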

Exercise: A function $f$ is called bounded if there are constants $A, B > 0$ such that $A < f(n) < B$ for all $n$. Let $f$ be bounded and let $g$ be any non-negative function; prove that $f\cdot g = \theta(g)$.

Exercise: Find a function $f$ that is not linear, but such that $f=\theta(n)$.

Sometimes we don't care about growth but about decay. That is, we want to know how small $f$ gets as $n$ increases. For example, $f$ could be the probability that an attacker manages to break your encryption, as a function of the length of your secret-key. We can do this using the same notation. For example, $f(n) = O(1/n)$ is what we mean when we say that a function decays (at least) linearly.
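For example (a toy function of my own, not one from the text above), $p(n) = \frac{2}{n+3}$ decays linearly:

```latex
% p(n) = O(1/n): take m = 2 and b = 0. Then for every n \ge 1,
p(n) \;=\; \frac{2}{n+3} \;\le\; \frac{2}{n} \;=\; m \cdot \frac{1}{n} + b .
```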

Exponentials

Exponentials and logarithms are how we typically describe things that are large or small. Note that I did not end the previous sentence with "respectively", and this is no coincidence. When we talk about growth (that is, how fast things go to infinity), exponential growth is large, while logarithmic growth is very slow (and in between we have, for example, polynomial growth: linear, quadratic, cubic, and so on). When we talk about decay (that is, how fast things go to zero), exponential decay is small.
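The following small sketch (arbitrary example functions, illustrative values only) makes the gap tangible by printing a logarithmic, a polynomial, and an exponential function side by side.

```python
# Illustrative comparison of growth rates: logarithmic vs. polynomial vs.
# exponential, evaluated at a few points.
import math

print(f"{'n':>4} {'log2(n)':>10} {'n^2':>8} {'2^n':>26}")
for n in [10, 20, 40, 80]:
    print(f"{n:>4} {math.log2(n):>10.2f} {n**2:>8} {2**n:>26}")
```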

As an example, consider generating a secret-key. In a typical encryption scheme, a secret-key can be represented as a uniformly random string of bits, where the length is chosen by the user. Say we use a key of length $n$: how many possible keys are there? The first bit is either $0$ or $1$, so two options. The second bit is either $0$ or $1$ too, so two more options on top of each of these two options, leading to four options, and so on. It is easy to convince yourself that the number of possible keys is $2^n$. Hence, the number of keys grows exponentially with $n$.
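A tiny sanity check in Python (feasible only for very small $n$, of course) confirms the counting argument by enumerating all bit strings of a given length.

```python
# Enumerate every bit string of length n and confirm there are exactly 2**n.
from itertools import product

for n in range(1, 6):
    keys = list(product([0, 1], repeat=n))
    print(f"n = {n}: {len(keys)} possible keys (2^{n} = {2**n})")
```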

Now say I am trying to guess your secret key. Since all the bits are uniformly random, all keys are equally likely, so the probability that I guess correctly (no matter how I guess) is one in $2^n$, that is, $1/2^n$, which is also commonly written as $2^{-n}$. So, since the number of possible keys grows exponentially (and the keys are uniform), the probability of guessing the key decays exponentially.
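Here is a hedged Monte Carlo sketch with toy key lengths (real keys are far longer, which is exactly the point): the measured success rate of a single random guess tracks $2^{-n}$.

```python
# Toy simulation: estimate the probability of guessing a uniform n-bit key
# in a single attempt, and compare it to the exact value 2**(-n).
import random

def single_guess_success_rate(n: int, trials: int = 500_000) -> float:
    hits = 0
    for _ in range(trials):
        key = random.getrandbits(n)    # the "real" key
        guess = random.getrandbits(n)  # one uniformly random guess
        hits += (key == guess)
    return hits / trials

for n in [1, 4, 8, 12]:
    print(f"n = {n:>2}: estimated {single_guess_success_rate(n):.6f}, exact {2**-n:.6f}")
```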

To be more concrete, we say that $f$ is exponential if there is some constant $c>0$ such that $f=\theta(e^{cn})$, where $e$ is the famous Euler's number. Why $e$ and not any other number? Because it is convenient. The fact is that for any positive number $b$ there is a number $c$ such that $b^n = e^{cn}$. (This number $c$ is called the natural logarithm of $b$, denoted $\ln b$, and we will discuss it shortly.) So you can replace $e$ with any number larger than $1$ and obtain a completely equivalent definition.
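A quick numerical check of that identity, with the arbitrary base $b=2$:

```python
# Check that b**n equals e**(c*n) when c is the natural logarithm of b.
import math

b = 2                 # an arbitrary base larger than 1
c = math.log(b)       # its natural logarithm, so b**n == e**(c*n)
for n in [1, 5, 10, 20]:
    print(f"n = {n:>2}: b^n = {b**n:>8}, e^(c*n) = {math.exp(c * n):.4f}")
```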

Exercise: Let $b>1$ and let $f$ satisfy that there is some $c>0$ such that $f=\theta\left(b^{cn}\right)$; prove that $f$ is exponential.

Logarithms

Negligible Functions
