2.8 Joint Density

Working with a single random variable helps us understand how the features of a pdf and a cdf relate to each other through differentiation and integration. However, there is not much else that we can do with one variable, and very little in our professional worlds looks like a single random variable in isolation.

We start to get to something useful when we consider joint density functions. A joint density function describes how probability is distributed over the values that two random variables take together. That is, if we are working with random variables \(X\) and \(Y\), then the joint density function lets us make probability statements about events that involve both variables, such as \(P(X \in A, Y \in B)\).

In this course, we might typically write this joint density function as \(f_{X,Y}(x,y) = f(\cdot)\) where \(f(\cdot)\) is the actual function that represents the joint probability. The \(f(\cdot)\) means, essentially, “some function” where we just have not designated the specifics of the function; you might think of this as a generic function.
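
As a reminder of how a joint density turns into a probability: for a continuous pair, the probability of a rectangular region is the volume under the density over that region. The limits \(a\), \(b\), \(c\), and \(d\) below are placeholder symbols introduced only for this illustration.

\[ P(a \le X \le b,\; c \le Y \le d) = \int_{c}^{d} \int_{a}^{b} f_{X,Y}(x,y) \, dx \, dy \]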

2.8.1 Example: Uniform Joint Density

Suppose that we know that two variables, \(X\) and \(Y\), are jointly uniformly distributed within the support \(x \in [0,4], y \in [0,4]\). We have a requirement, imposed by the Kolmogorov Axioms, that all probabilities must be non-negative, and that the total probability across the whole support must be one.

  • Can you use these facts to determine answers to the following:
    • What kind of shape does this joint pdf have?
    • What is the specific function that describes this shape?
    • If you draw this shape on three axes, one for \(X\), one for \(Y\), and one for the density \(f_{X,Y}(x,y)\), what does this plot look like?
    • How do you get from the joint density function to a marginal density function for \(X\)?
    • How do you get from the joint density function to a marginal density function for \(Y\)?
    • How do you get from these marginal density functions of \(X\) and \(Y\) back to the joint density? Is this always possible?
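
One way to check your answers to the questions above is to integrate numerically. The sketch below assumes that the joint density is a flat surface of constant height over the square, and that the height is one divided by the area of the square. The names `side`, `height`, `f_xy`, and `f_x` are hypothetical helpers introduced here; they are not part of the course materials.

```r
# Assumption: the joint density is constant on [0, 4] x [0, 4] and zero elsewhere.
side <- 4
height <- 1 / side^2  # chosen so that the volume under the surface is one

# Joint density, vectorized over x and y.
f_xy <- function(x, y) {
  ifelse(x >= 0 & x <= side & y >= 0 & y <= side, height, 0)
}

# Marginal density of X at a single point x0: integrate the joint density over y.
f_x <- function(x0) {
  integrate(function(y) f_xy(x0, y), lower = 0, upper = side)$value
}

# Total probability: integrate the marginal density of X over x.
total <- integrate(Vectorize(f_x), lower = 0, upper = side)$value

f_x(1)  # marginal density at x = 1; should be close to 0.25
total   # total probability; should be close to 1
```

By symmetry, the same steps with the roles of `x` and `y` swapped give the marginal density of \(Y\). Comparing the product of the two marginals against `f_xy` is one way to explore the last question in the list.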

2.8.2 Examples: Thinking Through Many Plots

An alumnus of the MIDS program and a former instructor of this course, Todd Young, built this nifty tool that lets us consider several different joint probability functions.

As a class, let’s consider a few of these PDFs, beginning with this “triangle” distribution.

knitr::include_app('http://www.statistics.wtf/PDF_Explorer/', height="1000px")

2.8.3 Triangle Math

After considering the intuition for the triangle distribution, write down the function that matches the figure that you’re seeing above.²

  • What is a full statement of the PDF of this image?
  • What is the marginal distribution of \(X\), \(f_{X}(x)\)?
  • What is the marginal distribution of \(Y\), \(f_{Y}(y)\)?
  • Using the definition of independence, are \(X\) and \(Y\) independent of each other?
  • What is the CDF of \(X\), \(F_{X}(x)\)?

2.8.4 Saddle Sores

Suppose that you know that two random variables, \(X\) and \(Y\), are jointly distributed with the following pdf:

\[ f_{X,Y}(x,y) = \begin{cases} a \cdot x^{2} \cdot y^{2} & 0 < x < 1,\ 0 < y < 1 \\ 0 & \text{otherwise} \end{cases} \]

This joint pdf is similar to the pdf that you can visualize above, under the distribution called “saddle”. The difference between this function and the image above is that the function restricts the support of \(x\) and \(y\) to the range \([0,1]\). This is to make the math easier for us in the next step.

  • Can you use these facts to determine the following?
    • What value of \(a\) makes this a valid joint pdf?
    • What is the marginal pdf of \(X\)? That is, what is \(f_{X}(x)\)?
    • What is the conditional pdf of \(X\) given \(Y\)? That is, what is \(f_{X|Y}(x|y)\)?
    • Given these facts, would you say that \(X\) and \(Y\) are dependent or independent?
    • If the support for this joint distribution were instead \([0,4]\) (rather than \([0,1]\)), how would the shape of the distribution change?
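
After attempting the questions above, you might compare your work against the following sketch of the calculation. It relies only on the pdf given in this section and the requirement that a pdf integrates to one over its support. Because the integrand is a product of a function of \(x\) and a function of \(y\), the double integral separates into two single integrals.

\[ 1 = \int_{0}^{1} \int_{0}^{1} a \, x^{2} y^{2} \, dx \, dy = a \left( \int_{0}^{1} x^{2} \, dx \right) \left( \int_{0}^{1} y^{2} \, dy \right) = \frac{a}{9} \quad \Rightarrow \quad a = 9 \]

\[ f_{X}(x) = \int_{0}^{1} 9 \, x^{2} y^{2} \, dy = 3x^{2}, \quad 0 < x < 1 \]

\[ f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_{Y}(y)} = \frac{9 \, x^{2} y^{2}}{3y^{2}} = 3x^{2}, \quad 0 < x < 1,\ 0 < y < 1 \]

By the same argument with the roles of \(x\) and \(y\) swapped, \(f_{Y}(y) = 3y^{2}\). The conditional pdf does not depend on \(y\) and matches the marginal pdf of \(X\), which is the pattern to look for when answering the question about independence.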

  1. Note that, in general, this kind of curve fitting is not a common data science task. Instead, it is a learning exercise that lets the class assess their understanding of the definitions of random variables.