1.7 Axiomatic Probability

The book makes a point of defining our axioms of probability, calling them the Kolmogorov axioms.

Definition 1.1 Kolmogorov Axioms

Let \(\Omega\) be a sample space, \(S\) be an event space, and \(P\) be a probability measure. Then, \((\Omega, S, P)\) is a probability space if it satisfies the following:

  • Non-negativity: \(\forall A \in S, P(A) \geq 0\), where \(P(A)\) is finite and real.
  • Unitarity: \(P(\Omega)=1\).
  • Countable additivity: if \(A_1, A_2, A_3, \dots \in S\) are pairwise disjoint, then

\[ P(A_1 \cup A_2 \cup A_3 \cup \dots) = P(A_1) + P(A_2) + P(A_3) + \dots = \sum_{i}P(A_{i}) \]
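
To make the definition concrete, here is a minimal sketch in Python that checks the three axioms on a small example. The fair six-sided die, the names `omega`, `S`, `P`, and `power_set`, and the choice of the power set as the event space are all assumptions made for illustration; none of them come from the book. Because the sample space is finite, the sketch checks additivity only for pairs of disjoint events, which is the finite version of countable additivity.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical example: the sample space of a fair six-sided die.
omega = frozenset(range(1, 7))


def power_set(outcomes):
    """Return every subset of a finite set; used here as the event space S."""
    items = list(outcomes)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def P(event):
    """Uniform probability measure (an assumption): each outcome has mass 1/6."""
    return Fraction(len(event), len(omega))


S = power_set(omega)

# Non-negativity: every event has probability >= 0.
non_negative = all(P(A) >= 0 for A in S)

# Unitarity: the whole sample space has probability 1.
unitary = P(omega) == 1

# Additivity, checked for every pair of disjoint events in S.
additive = all(
    P(A | B) == P(A) + P(B)
    for A in S
    for B in S
    if A.isdisjoint(B)
)

print(non_negative, unitary, additive)
```

Using `Fraction` instead of floating-point numbers keeps the equality checks exact, so the additivity test is not disturbed by rounding error.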

There is a lot going on in this definition!

First things first, these are the axioms of probability (read aloud in the booming voice of a god).

This means that these are the things we begin from: the foundational principles of the entire system of reasoning that we are going to use. In the style of argument that we’re going to make, these statements are off-limits to question. Instead, they serve as the grounding assumptions, and we see what follows as we flow forward from them.

Second, and importantly, there is a very large set of things that we can build from these axioms. The first things that we will build are probability statements about atomic outcomes (Theorem 1.1.4 in the book) and about collections of events. But these statements are not the only things we can build. We can also build Frequentist statistics, Bayesian statistics, and language models.
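
As a small taste of what building from the axioms looks like, here is a sketch of one standard consequence, the complement rule. This is offered as an illustration and is not a restatement of the book's theorem. It assumes that the event space \(S\) contains the complement \(A^{C} = \Omega \setminus A\) of every event \(A \in S\) (the notation \(A^{C}\) is our own choice here), and that additivity holds for two disjoint events, which follows from countable additivity by padding the sequence with empty sets. Since \(A\) and \(A^{C}\) are disjoint and \(A \cup A^{C} = \Omega\), unitarity and additivity give

\[ 1 = P(\Omega) = P(A \cup A^{C}) = P(A) + P(A^{C}) \quad \Longrightarrow \quad P(A^{C}) = 1 - P(A) \]

Combined with non-negativity of \(P(A^{C})\), this also shows that \(P(A) \leq 1\) for every event \(A \in S\).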

In many ways, these axioms are the fundamental particles that hold our system of probabilistic reasoning together. They are to probability what fermions and bosons are to physics.