Probability Spaces

A probability function P\mathbb{P} takes an event, and tells us how likely it is as a number from 00 to 11. For example, if I roll a (fair) die, the event AA that it came out four has probability P[A]=16\mathbb{P}[A] = \frac{1}{6}.

What is the probability that it will be either three or four? Let AA be the event that it is three, and BB the event that it is four, then the event that it is either is denoted ABA\cup B. Note that it is impossible that die is both three and four, we call such events disjoint. Since AA and BB are disjoint, we get that the probability either happens is the sum of probabilities they do happen. There's a one in six chance to get three, and a one in six chance to get four, hence there's a two in six chance to get either. Or in math:

P[AB]=P[A]+P[B]=16+16=13.\mathbb{P}[A\cup B] = \mathbb{P}[A]+\mathbb{P}[B] = \frac{1}{6} + \frac{1}{6} = \frac{1}{3}\text{.}

When events are not disjoint, we cannot sum them up like this. For example, the probability that the result is odd is three in six, or one-half. The probability that the result is at most three is also three in six, or one-half. So the probability that the result is either odd or at most three is half + half = one? Of course not: there is a positive probability that we roll a four, which is neither odd nor smaller than three.

when summing the two we overlook the fact that a result can be both odd and at most three, making us count these events twice. We said that three in six are odd and three in six are at most three, but three and one are both. We overcounted.

Exercise: Given two events AA and BB, let ABA\cap B be the event that they both happen (if they are disjoint, then ABA\cap B is the empty event \emptyset that satisfies P[]=0\mathbb{P}[\emptyset]=0), convince yourself that

P[AB]=P[A]+P[B]P[AB].\mathbb{P}[A\cup B] = \mathbb{P}[A] + \mathbb{P}[B] - \mathbb{P}[A\cap B]\text{.}
Solution

Intuitively, we have that ABA\cap B is all the events that are in both AA and BB, so we count them once in P[A]\mathbb{P}[A] and a second time in P[B]\mathbb{P}[B], so we have to subtract them once to balance out the double counting.

Formally, we can write A=ABA' = A \setminus B (that is AA' is the event that AA happened and BB didn't), So AB=ABA\cup B = A' \cup B but AA' and BB are disjoint, so we have:

P[AB]=P[AB]=P[A]+P[B]\mathbb{P}[A\cup B] = \mathbb{P}[A' \cup B] = \mathbb{P}[A'] + \mathbb{P}[B]

However, we also have that A=A(AB)A = A' \cup (A\cap B), where the union is also disjoint, so we have that P[A]=P[A]+P[AB]\mathbb{P}[A] = \mathbb{P}[A'] + \mathbb{P}[A\cap B], or equivalently that P[A]=P[A]P[AB]\mathbb{P}[A'] = \mathbb{P}[A] - \mathbb{P}[A\cap B].

Substituting this into the equation above we get

P[AB]=P[A]+P[B]=P[A]+P[B]P[AB]\mathbb{P}[A\cup B] = \mathbb{P}[A' ] + \mathbb{P}[B] = \mathbb{P}[A] + \mathbb{P}[B]-\mathbb{P}[A\cap B]

as needed.

Atomic Events

We say that an event AA contains another event BB, and denote BAB\subseteq A, if AA is guaranteed to happen given that BB happens. For example, the event X=4X=4 is contained in the event that XX is even. An event is called atomic if it has positive probability, but it does not contain any other event that has positive probability. For example, a fair die has six atomic events, for each n=1,,6n=1,\ldots,6 we have the atomic event X=nX=n.

It is conventional to call the set of atomic events Ω\Omega. For example Ω={ω1,,ω6}\Omega = \{\omega_1,\ldots,\omega_6\} describes the atomic events of any six-sided die. The die doesn't have to be fair, but it does have to satisfy that each of its six sides has some probability to be the result.

Exercise: prove that any two different atomic events are disjoint

The set Ω\Omega is also commonly called the "sample space", as it describes all possible results of an experiment.

Finite and Discrete Probability Spaces

Given a set of atomic events Ω\Omega, we can endow them with a probability function a probability function P\mathbb{P} that assigns to each atomic event ωΩ\omega\in\Omega a number 0<P[ω]10<\mathbb{P}[\omega]\le1. The pair (Ω,P)(\Omega,\mathbb{P}) is called a discrete probability space if the probabilities of all atomic events sum to one. In other words, if all events can be described as a union of atomic events.

Note that since atomic events are disjoint, the definition of P\mathbb{P} naturally extends to all events, simply as the sum of atomic events that comprise them. For example, if we AA be the event that the result is at most three, then we have

P[A]=P[ω1ω2ω3]=P[ω1]+P[ω2]+P[ω3]\mathbb{P}[A]=\mathbb{P}[\omega_{1}\cup\omega_{2}\cup\omega_{3}]=\mathbb{P}[\omega_{1}]+\mathbb{P}[\omega_{2}]+\mathbb{P}[\omega_{3}]

If we want to describe a fair die, we will define Ω={1,,6}\Omega = \{1,\ldots,6\} and P[n]=16\mathbb{P}[n] = \frac{1}{6} for any n=1,,6n=1,\ldots,6, and we will indeed get that the probability that the die is less than three is

P[A]=P[ω1ω2ω3]=P[ω1]+P[ω2]+P[ω3]=16+16+16=12\mathbb{P}[A]=\mathbb{P}[\omega_{1}\cup\omega_{2}\cup\omega_{3}]=\mathbb{P}[\omega_{1}]+\mathbb{P}[\omega_{2}]+\mathbb{P}[\omega_{3}]=\frac{1}{6}+\frac{1}{6}+\frac{1}{6}=\frac{1}{2}

With a bit of thought, one can convince herself that if Ω\Omega is finite then any probability space over Ω\Omega is discrete. However, a space can be discrete also if Ω\Omega is infinite. For example, lets say I flip a fair coin util it lands on head, how many flips would it take? Well, for any positive integer nn there is some probability that it will take nn attempts. So our sample space is has an event for any positive natural numbe Ω={ω1,ω2,}\Omega = \{\omega_1,\omega_2,\ldots\}.

The probability that a single flip of a fair coin will land on head is 1/21/2. For it to take exactly two attempts means that the first flip landed on tails and the second landed on heads, which has a probability 1/41/4 (as it is one of four equally likely results of flipping two coins). It would take exactly three attempts with probability 1/81/8 and so on. The pattern is quite obvious: the probability it would take exactly nn flips is (12)n\left(\frac{1}{2}\right)^n, which we can more comfortable write as 2n2^{-n}.

We said that the probabilities sum to one, and indeed if we let AnA_n be the event that it took nn flips, we get that

P[ω1]+P[ω2]+P[ω3]+=12+14+18+=1\mathbb{P}[\omega_1] + \mathbb{P}[\omega_2] + \mathbb{P}[\omega_3] + \ldots = \frac{1}{2} +\frac{1}{4} + \frac{1}{8} + \ldots = 1

as we can easily compute with the geometric sum formula.

We can now ask ourselves, say, what is the probability this would take at most ten tries, and compute that the result is

i=1102n=1210=1023102499.8%\sum_{i=1}^{10}2^{-n} = 1-2^{-10} = \frac{1023}{1024} \approx 99.8\%

Exercise: prove a more generalized version of the equality above, namely that the probability it would take at most nn coin flips is 12(n1)1-2^{-(n-1)}. Hint, recall the finite geometric sum formula: for any qq we have that

1+q+q2++qn=k=0nqk=1qn+11q1+q+q^2 + \ldots + q^n = \sum_{k=0}^n q^k = \frac{1-q^{n+1}}{1-q}
Solution

Setting q=12q = \frac{1}{2} we have that

k=1n12k=k=0n12k1=112n+11121=2(112n+1)1=2212n+11=112n\begin{aligned}\sum_{k=1}^{n}\frac{1}{2^{k}} & =\sum_{k=0}^{n}\frac{1}{2^{k}}-1\\ & =\frac{1-\frac{1}{2^{n+1}}}{1-\frac{1}{2}}-1\\ & =2\left(1-\frac{1}{2^{n+1}}\right)-1\\ & =2-2\cdot\frac{1}{2^{n+1}}-1\\ & =1-\frac{1}{2^{n}} \end{aligned}

We can even look at infinite unions. For example, we can ask ourselves what is the probability that it took us an even number of attempts? One can compute that this comes out to

n=1ω2n=n=1122n=n=114n=13\sum_{n=1}^{\infty}\omega_{2n}=\sum_{n=1}^{\infty}\frac{1}{2^{2n}}=\sum_{n=1}^{\infty}\frac{1}{4^{n}}=\frac{1}{3}

Exercise: prove the above inequality. Hint, recall the infinite geometric sum formula: if 1<q<1-1<q<1 then

1+q+q2+=k=0qk=11q1+q+q^2 + \ldots = \sum_{k=0}^\infty q^k = \frac{1}{1-q}

Remark: When is a probability space not discrete? We won't talk about them much, and definitely not introduce them formally, but there could be continuous probability spaces. For example, we can uniformly sample real number between 00 and 11. One can easily see that the probability to sample any particular number is 00. Worse yet, for any 0a<b10\le a<b \le 1 the event "we sampled a number between aa and bb" is not atomic, as it strictly contains the event "we sampled a number between aa' and bb'" whenever a<a<b<ba<a'<b'<b. So such events are never atomic, and a similar approach shows that, in fact, there are no atomic events at all in this space. This means that our definitions are too narrow to expend to the continuous case, and while that definitely can be done, we have no reason to do so.

Last updated