2.5 Pieces of a Random Variable

Definition 2.2 (Random Variable, Suite) A random variable is a function \(X : \Omega \rightarrow \mathbb{R}\) such that \(\forall r \in \mathbb{R}, \{\omega \in \Omega : X(\omega) \leq r\} \in S\).

There are two key pieces that must exist for every random variable. What are these pieces? The first of these pieces is provided to us in Definition 1.2.1 Random Variable (on page 16). The second is provided to us in Definition 1.2.5 Probability Mass Function (on page 18).

Suppose that a random variable is simple and discrete. For concreteness, you could think of this random variable as the answer to the question, “Is the grass wet outside?”.

  1. What is the sample space?
  2. What is a sensible function that you might use to map from the sample space to real values?
  3. What is an insensible function that you might use to map from the sample space to real values? (A student well-seasoned in Maths might use, and define for the rest of the class, the concept of a bijective function.)
  4. If you had only the values that the random variable maps to, are you guaranteed to be able to describe the entire sample space? Why or why not?
  5. How would you go about determining the probability mass function for this random variable?
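
One way to make the two pieces concrete is the small Python sketch below. The outcome labels, the 0/1 mapping, and the probabilities are illustrative assumptions rather than values given in the text; they are only one of many possible answers to the questions above.

```python
from fractions import Fraction

# Hypothetical sample space for "Is the grass wet outside?"
omega = ["wet", "dry"]

# Assumed probabilities of each outcome (illustrative numbers only)
prob = {"wet": Fraction(3, 10), "dry": Fraction(7, 10)}

def X(outcome):
    """Piece 1, the random variable: maps each outcome to a real number."""
    return 1 if outcome == "wet" else 0

def pmf(x):
    """Piece 2, the pmf: total probability of the outcomes that X maps to x."""
    return sum((prob[w] for w in omega if X(w) == x), Fraction(0))

print(pmf(1), pmf(0))
```

Note that `pmf` is defined on real values, not on outcomes: it never needs to know the labels "wet" and "dry", which is relevant to question 4.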

2.5.1 Functions of Functions

Why do we say that random variables are functions? Is there some useful property of these being functions rather than any other quantity? What else could they be if not a function?

What about a function of a random variable, which is a function of a function?

Definition 2.3 (Function of a Random Variable) Let \(g : U \rightarrow \mathbb{R}\) be some function, where \(X(\Omega) \subset U \subset \mathbb{R}\). Then, if \(g \circ X : \Omega \rightarrow \mathbb{R}\) is a random variable, we say that \(g\) is a function of \(X\) and write \(g(X)\) to denote the random variable \(g \circ X\).

If a random variable is a function from the real world, or the sample space, or the outcome space to a real number, then what does it mean to define a function of a random variable?

  • At what point does this function work? Does this function change the sample space that is possible to observe? Or, does this function change the real-number that each outcome points to?
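
The composition in Definition 2.3 can be sketched in a few lines of Python. The die-roll sample space and the particular choice of `g` are hypothetical, picked only to show where \(g\) acts.

```python
# Hypothetical sample space: one roll of a six-sided die
omega = [1, 2, 3, 4, 5, 6]

def X(outcome):
    """Random variable: the number of pips showing, as a real number."""
    return float(outcome)

def g(x):
    """Assumed example function on the reals: squared distance from 3.5."""
    return (x - 3.5) ** 2

def g_of_X(outcome):
    """The composition g(X): still a function defined on the sample space."""
    return g(X(outcome))

# Same outcomes as before; only the real number attached to each one differs.
values = {w: g_of_X(w) for w in omega}
print(values)
```

In this sketch `g` only ever sees the number that `X` produced, so the keys of `values` are exactly the original outcomes.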

Example 2.1 (MNIST) Suppose that you are doing some image processing work. To keep things simple, suppose that you are doing image classification in the style of the MNIST dataset.

  • Can someone describe what this task is trying to accomplish?
  • Has anyone done work like this?

However, suppose that rather than having good clean indicators for whether a pixel is on or off, instead you have weak indicators – there’s a lot of grey. A lot of the cells are marked in the range \(0.2 - 0.3\).

  1. How might creating a function that re-maps this grey into more extreme values help your model?
  2. Is it possible to “blur” events that are in the outcome space? Does this “blurring” meet the requirements of a function of a random variable, as provided above?
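
As a starting point for question 1, here is a minimal sketch of one possible re-mapping: a linear contrast stretch. The function name `stretch` and the choice to clip outside the \(0.2 - 0.3\) range are assumptions made for illustration; a hard threshold or a sigmoid would be equally reasonable candidates.

```python
def stretch(pixel, low=0.2, high=0.3):
    """Linearly rescale [low, high] onto [0, 1], clipping values outside that range."""
    scaled = (pixel - low) / (high - low)
    return min(1.0, max(0.0, scaled))

# Hypothetical "weak" pixel intensities clustered in the grey band
weak = [0.20, 0.22, 0.25, 0.28, 0.30]
print([round(stretch(p), 2) for p in weak])
```

Because `stretch` takes a real number and returns a real number, it plays the role of \(g\) in Definition 2.3, applied to each pixel's intensity.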

2.5.2 Probability Density Functions and Cumulative Distribution Functions

  • What is a probability mass function?
  • What do the Kolmogorov Axioms mean must be true about any probability mass function (pmf)?
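
For a discrete random variable with finitely many values, the axioms imply that every pmf value is non-negative and that the values sum to one. A minimal sketch of that check follows; the helper name `is_valid_pmf` and the dictionary representation are assumptions made for illustration.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Check non-negativity and unit total mass for a pmf given as {value: probability}."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return non_negative and sums_to_one

fair_coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
broken = {0: Fraction(3, 4), 1: Fraction(1, 2)}  # total mass is 5/4
print(is_valid_pmf(fair_coin), is_valid_pmf(broken))
```

Exact fractions are used so that the sum-to-one check is not affected by floating-point rounding.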

Example 2.2 (Berkeley Drivers, No Survivors) You should try driving in Berkeley some time. It is a trip! Without being deliberately ageist, the city is full of ageing hippies driving Subaru Outbacks and making what seem to be stochastic right-or-left turns to buy incense, pottery, or just sourdough bread.

Suppose that you are walking to campus, and you have to cross 10 crosswalks, each of which is spaced a block apart. Further, suppose that as you get closer to campus, there are fewer aging hippies, and therefore, there is decreasing risk that you’re hit by a Subaru as you cross the street. Specifically, and fortunately for our math, the risk of being hit decreases linearly with each block that you cross.

Finally, campus provides you with the safety reports from last year, and reports that there were 120 student-Subaru incidents last year, out of 10,000 student-crosswalk crossings.

  1. What is the pmf for the probability that you are involved in a student-Subaru incident as you walk across these 10 blocks? What sample space, \(\Omega\) is appropriate to represent this scenario?
  2. Suppose that you don’t leave your house – this is a remote program after all! What is your cumulative probability of being involved in a student-Subaru incident?
  3. What is the cumulative distribution function (cdf) for the probability that you are involved in a student-Subaru incident?
  4. Suppose that you live three blocks from campus, but your classmate lives five blocks from campus. What is the difference in the cumulative probability?
  5. How would you describe the cumulative probability of being hit as you walk closer to campus? That is, suppose that you start 10 blocks away from campus, and are walking to get closer. Is your cumulative probability of being hit on your way to campus increasing or decreasing as you get closer to campus?
  6. How would you describe the cumulative probability of being hit as you walk further from campus? That is, suppose that you start on campus, and you’re walking to a bar after classes. Is your cumulative probability of being hit on your way away from campus increasing or decreasing as you get further from campus?
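
The scenario leaves several modeling choices open, so the sketch below is one possible reading rather than the intended answer. It assumes that (a) crosswalk 1 is the farthest from campus and crosswalk 10 the nearest, (b) the risk at crosswalk \(k\) is proportional to \(11 - k\), (c) the reported 120 out of 10,000 is the average per-crossing risk over the ten crosswalks, and (d) at most one incident can occur per walk, so the per-crosswalk risks can be treated as a pmf over "the crosswalk where the incident happens", with 0 standing for "no incident". All names in the code are hypothetical.

```python
from fractions import Fraction

N_CROSSWALKS = 10
AVG_RISK = Fraction(120, 10_000)  # reported incidents per crossing

# Assumption (b): linear weights, 10 for the farthest crosswalk down to 1 for the nearest
weights = {k: N_CROSSWALKS + 1 - k for k in range(1, N_CROSSWALKS + 1)}
total_weight = sum(weights.values())

# Assumption (c): scale so the mean risk across the ten crosswalks equals AVG_RISK
pmf = {k: AVG_RISK * N_CROSSWALKS * w / total_weight for k, w in weights.items()}
pmf[0] = 1 - sum(pmf.values())  # assumption (d): 0 encodes "no incident on this walk"

def cumulative_risk(n_blocks):
    """Probability of an incident when walking the last n_blocks crosswalks to campus."""
    start = N_CROSSWALKS - n_blocks + 1
    return sum((pmf[k] for k in range(start, N_CROSSWALKS + 1)), Fraction(0))

for n in (0, 3, 5, 10):
    print(n, cumulative_risk(n))
```

Under these assumptions, `cumulative_risk(0)` corresponds to question 2 and `cumulative_risk(5) - cumulative_risk(3)` to question 4. A different reading of "decreases linearly" or of the safety report would change the numbers, though the cumulative probability would still never decrease as more crossings are added.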