Problem


Blp | Easy


Consider random variables $X$ and $Y$. In the technique called regression through the origin, we are interested in linear predictors of the form,

$$g(x) = b_1 x$$

In other words, linear predictors that pass through the origin. Given such a predictor, define $\epsilon = Y - g(X)$ as always. We are interested in minimizing mean squared error:

$$\beta_1 = \text{argmin}_{b_1} E[\epsilon^2]$$

Examine the proof on page 77 of Agnostic Statistics and consider how it would be different for regression through the origin.

  1. Prove that $E[\epsilon X] = 0$ as before.
  2. Is it still true that $Cov[\epsilon, X] = 0$? Prove it or give a counterexample.
  3. Compute an expression for $\beta_1$.
Answer:

Regression: Origins

Solution:

Part 1

The first part of the proof is almost exactly the same as that in the textbook, except that there is only one first-order condition, not two. The optimization is,

$\beta_1 = \text{argmin}_{b_1} E[\epsilon^2]$

The single first-order condition is therefore,

$0 = \frac{\partial E[\epsilon^2]}{\partial \beta_1}$

The rest of the computation is the same:

$0 = \frac{\partial E[\epsilon^2]}{\partial \beta_1} = E\left[ \frac{\partial \epsilon^2}{\partial \beta_1} \right] = E\left[ 2\epsilon \frac{\partial \epsilon}{\partial \beta_1} \right] = -2E[\epsilon X]$

Notice that in the last step above, the partial derivative of the error with respect to the slope is the same as in the textbook, even though the predictor is different.

$\frac{\partial \epsilon}{\partial \beta_1} = \frac{\partial (Y - \beta_1 X)}{\partial \beta_1} = -X$

We can therefore conclude that $E[\epsilon X] = 0$.
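As a quick sanity check on this first-order condition (an illustration added here, not part of the textbook proof), the sketch below simulates an arbitrary joint distribution for $(X, Y)$, numerically minimizes the sample analogue of $E[\epsilon^2]$ over $b_1$, and verifies that the sample analogue of $E[\epsilon X]$ is approximately zero at the minimizer. The simulated distribution is an assumption chosen purely for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Illustrative simulated data (not from the problem statement).
rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, scale=2.0, size=100_000)
Y = 1.5 * X + rng.normal(scale=1.0, size=X.shape)

# Sample analogue of E[eps^2] for the through-origin predictor g(x) = b1 * x.
def mse(b1):
    return np.mean((Y - b1 * X) ** 2)

# Numerically minimize over b1; the minimizer plays the role of beta_1.
b1_hat = minimize_scalar(mse).x

# First-order condition: the sample analogue of E[eps * X] should be ~0.
eps = Y - b1_hat * X
print(b1_hat, np.mean(eps * X))  # second value should be near zero
```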

Part 2

Is it still true that $Cov[\epsilon, X] = 0$? First some intuition: we know $Cov[\epsilon, X] = E[\epsilon X] - E[\epsilon]E[X] = - E[\epsilon]E[X]$. In regular regression (and the textbook), the other first-order condition tells us that $E[\epsilon] = 0$, but we don't have that first-order condition here, so you might suspect that the covariance may not always equal zero. Indeed, the statement is sometimes false.

For a counter-example, consider the discrete distribution with probability mass function $f$ given by,

$$f(x,y) = \begin{cases}1/2, &(x,y) \in \{ (1,0), (0,1) \}\\ 0, &\text{otherwise}\end{cases}$$

We can compute the following:

$$E[XY] = \sum_{(x,y) \in Supp[X,Y]} xy f(x,y) = 1 \cdot 0 \cdot \frac{1}{2} + 0 \cdot 1 \cdot \frac{1}{2} = 0$$

$$E[X] = 1 \cdot \frac{1}{2} + 0 \cdot \frac{1}{2} = \frac{1}{2}, \qquad E[Y] = 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{2} = \frac{1}{2}, \qquad E[X^2] = 1^2 \cdot \frac{1}{2} + 0^2 \cdot \frac{1}{2} = \frac{1}{2}$$

Plugging into the formula for the slope derived in Part 3 below,

$$\beta_1 = \frac{E[XY]}{E[X^2]} = 0$$

We find that the regression line has zero slope. The covariance can then be computed as follows:

$$Cov[\epsilon, X] = Cov[Y - 0 X, X] = Cov[Y,X] = E[XY] - E[X]E[Y] = 0 - \frac{1}{2}\frac{1}{2}= - \frac{1}{4}$$

Thus, we note that the covariance is not zero.
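Because the counterexample uses an explicit discrete distribution, its numbers are easy to verify directly. Here is a minimal Python check (added for illustration, not part of the original solution) that recomputes $\beta_1$ and $Cov[\epsilon, X]$ from the pmf above.

```python
import numpy as np

# The counterexample's pmf: mass 1/2 at (1, 0) and mass 1/2 at (0, 1).
points = np.array([[1.0, 0.0], [0.0, 1.0]])
probs = np.array([0.5, 0.5])
x, y = points[:, 0], points[:, 1]

E_XY = np.sum(probs * x * y)   # = 0
E_X2 = np.sum(probs * x**2)    # = 1/2
E_X = np.sum(probs * x)        # = 1/2

beta1 = E_XY / E_X2            # = 0, so the through-origin line is flat
eps = y - beta1 * x            # epsilon = Y - beta1 * X at each support point

# Cov[eps, X] = E[eps * X] - E[eps] * E[X]
cov_eps_x = np.sum(probs * eps * x) - np.sum(probs * eps) * E_X
print(beta1, cov_eps_x)        # prints 0.0 -0.25
```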

Part 3

Solving the first-order condition from Part 1:

$$\begin{align}0 &= E[\epsilon X] \\&= E[(Y - \beta_1 X)X] \\&= E[XY - \beta_1 X^2] \\&= E[XY] - \beta_1 E[X^2] \\\beta_1 &= \frac{E[XY]}{E[X^2]}\end{align}$$

(The last step assumes $E[X^2] > 0$.)
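As a final illustration (a sketch under stated assumptions, not part of the original solution), the sample analogue of this formula is the through-origin least-squares slope $\sum_i x_i y_i / \sum_i x_i^2$. The snippet below computes it on simulated data and checks that it matches NumPy's no-intercept least-squares fit; the data-generating process shown is purely hypothetical.

```python
import numpy as np

# Hypothetical data; in practice x and y would be observed samples.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 3.0, size=1_000)
y = 2.0 * x + rng.normal(scale=0.5, size=x.shape)

# Plug-in estimate of beta_1 = E[XY] / E[X^2].
beta1_plugin = np.mean(x * y) / np.mean(x**2)

# Same quantity via least squares with a single column and no intercept.
beta1_lstsq = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]

print(beta1_plugin, beta1_lstsq)  # the two estimates agree
```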