8.2 Asymptotic Normality
We start by deriving an alternative expression for the OLS estimator \(\widehat{{\boldsymbol{\beta}}}\) using matrix notation.
\[ \begin{aligned} \widehat{{\boldsymbol{\beta}}} &=\left[{\mathbb{X}}^T{\mathbb{X}}\right]^{-1}{\mathbb{X}}^T{\boldsymbol{Y}} \\ &=\left[{\mathbb{X}}^T{\mathbb{X}}\right]^{-1}{\mathbb{X}}^T({\mathbb{X}}{\boldsymbol{\beta}}+{\boldsymbol{\varepsilon}}) \\ &=\left[{\mathbb{X}}^T{\mathbb{X}}\right]^{-1}({\mathbb{X}}^T{\mathbb{X}}){\boldsymbol{\beta}}+ \left[{\mathbb{X}}^T{\mathbb{X}}\right]^{-1}{\mathbb{X}}^T{\boldsymbol{\varepsilon}} \\ &={\boldsymbol{\beta}} + \left[{\mathbb{X}}^T{\mathbb{X}}\right]^{-1}{\mathbb{X}}^T{\boldsymbol{\varepsilon}} \end{aligned} \]
So, \[\begin{equation} \widehat{{\boldsymbol{\beta}}}-{\boldsymbol{\beta}} = \left[{\mathbb{X}}^T{\mathbb{X}}\right]^{-1}{\mathbb{X}}^T{\boldsymbol{\varepsilon}} \tag{8.1} \end{equation}\]
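As a quick sanity check, Equation (8.1) can be verified numerically on simulated data. The following is a minimal sketch; the design, coefficients, and noise level are arbitrary illustrative choices.

```python
import numpy as np

# Simulate a linear model Y = X beta + eps and verify Equation (8.1):
# beta_hat - beta equals (X'X)^{-1} X' eps up to floating-point round-off.
rng = np.random.default_rng(0)
n, k = 500, 3
X = rng.normal(size=(n, k))          # design matrix (rows are observations X_i)
beta = np.array([1.0, -2.0, 0.5])    # true coefficients
eps = rng.normal(size=n)             # errors
Y = X @ beta + eps

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)   # OLS estimator
rhs = np.linalg.solve(X.T @ X, X.T @ eps)      # (X'X)^{-1} X' eps

print(np.allclose(beta_hat - beta, rhs))       # the two sides agree
```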
We can then multiply both sides of Equation (8.1) by \(\sqrt{n}\). Writing \({\mathbb{X}}^T{\mathbb{X}}=\sum_{i=1}^n{\boldsymbol{X}}_i^T{\boldsymbol{X}}_i\) and \({\mathbb{X}}^T{\boldsymbol{\varepsilon}}=\sum_{i=1}^n{\boldsymbol{X}}_i^T\varepsilon_i\), we get \[ \begin{aligned} \sqrt{n}\left(\widehat{{\boldsymbol{\beta}}}-{\boldsymbol{\beta}}\right) &=\left( \frac{1}{n}\sum\limits_{i=1}^n{\boldsymbol{X}}_i^T{\boldsymbol{X}}_i \right)^{-1} \left( \frac{1}{\sqrt{n}}\sum\limits_{i=1}^n{\boldsymbol{X}}_i^T\varepsilon_i \right) \\ &=\widehat{{\mathbb{Q}}}_{{\boldsymbol{XX}}}^{-1} \left( \frac{1}{\sqrt{n}}\sum\limits_{i=1}^n{\boldsymbol{X}}_i^T\varepsilon_i \right) \end{aligned} \] From the consistency of the OLS estimators, we already have \[ \widehat{{\mathbb{Q}}}_{{\boldsymbol{XX}}}\xrightarrow[p]{\quad\quad}{\mathbb{Q}}_{{\boldsymbol{XX}}}\] Our aim now is to understand the distribution of the stochastic term (the second factor) in the above expression.
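The convergence \(\widehat{{\mathbb{Q}}}_{{\boldsymbol{XX}}}\to{\mathbb{Q}}_{{\boldsymbol{XX}}}\) is easy to see in simulation. A sketch under the illustrative assumption that \({\boldsymbol{X}}\) has i.i.d. standard normal entries, so that \({\mathbb{Q}}_{{\boldsymbol{XX}}}\) is the identity matrix:

```python
import numpy as np

# Law of large numbers in action: the sample matrix
# Q_hat_XX = (1/n) sum_i X_i' X_i approaches Q_XX as n grows.
rng = np.random.default_rng(3)
k = 3
errs = []
for n in (100, 10_000, 1_000_000):
    X = rng.normal(size=(n, k))
    Q_hat = X.T @ X / n
    err = np.abs(Q_hat - np.eye(k)).max()   # worst entrywise deviation
    errs.append(err)
    print(n, err)                           # error shrinks as n grows
```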
We first note (from the i.i.d. assumption and Theorem 4.3) that \[ {\mathbb{E}\left[ {\boldsymbol{X}}_i^T\varepsilon_i \right]}={\mathbb{E}\left[ {\boldsymbol{X}}^T\varepsilon \right]}={\boldsymbol{0}}. \] Let us compute the covariance matrix of \({\boldsymbol{X}}_i^T\varepsilon_i\). Since the expectation vector is zero, we have \[ {\mathbb{V}}[{\boldsymbol{X}}_i^T\varepsilon_i]={\mathbb{E}\left[ {\boldsymbol{X}}_i^T\varepsilon_i\left({\boldsymbol{X}}_i^T\varepsilon_i\right)^T \right]}={\mathbb{E}\left[ {\boldsymbol{X}}^T{\boldsymbol{X}}\varepsilon^2 \right]}\stackrel{\text{def}}{=}{\mathbb{A}}. \] Since the observations \(\{(Y_i,{\boldsymbol{X}}_i)\}\) are i.i.d., the vectors \(\{{\boldsymbol{X}}_i^T\varepsilon_i\}\) are i.i.d. as well. By the (multivariate) Central Limit Theorem, as \(n\to\infty\) \[ \frac{1}{\sqrt{n}}\sum\limits_{i=1}^n{\boldsymbol{X}}_i^T\varepsilon_i \xrightarrow[d]{\quad\quad}\mathcal{N}({\boldsymbol{0}},{\mathbb{A}}). \] There is a small technicality here: the entries of \({\mathbb{A}}\) must be finite. This can be ensured by a stronger regularity condition on the moments, e.g., \({\mathbb{E}\left[ Y^4 \right]},{\mathbb{E}\left[ ||{\boldsymbol{X}}||^4 \right]}<\infty\). Putting everything together, we conclude \[ \sqrt{n}(\widehat{{\boldsymbol{\beta}}}-{\boldsymbol{\beta}})\xrightarrow[d]{\quad\quad} {\mathbb{Q}}_{{\boldsymbol{XX}}}^{-1}\mathcal{N}({\boldsymbol{0}},{\mathbb{A}}) =\mathcal{N}\left({\boldsymbol{0}},\left[{\mathbb{Q}}_{{\boldsymbol{XX}}}^{-1}\right]^T{\mathbb{A}}{\mathbb{Q}}_{{\boldsymbol{XX}}}^{-1}\right) =\mathcal{N}\left({\boldsymbol{0}},{\mathbb{Q}}_{{\boldsymbol{XX}}}^{-1}{\mathbb{A}}{\mathbb{Q}}_{{\boldsymbol{XX}}}^{-1}\right), \] where the last equality uses the symmetry of \({\mathbb{Q}}_{{\boldsymbol{XX}}}\) (and hence of its inverse).
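The CLT for the scaled score can likewise be illustrated by simulation: across many replications of \(n^{-1/2}\sum_i {\boldsymbol{X}}_i^T\varepsilon_i\), the empirical covariance should approach \({\mathbb{A}}\). A sketch under the illustrative assumptions that the regressors are standard normal and the errors are independent of \({\boldsymbol{X}}\) with variance \(\sigma^2\), in which case \({\mathbb{A}}=\sigma^2{\mathbb{Q}}_{{\boldsymbol{XX}}}\):

```python
import numpy as np

# Monte Carlo check that n^{-1/2} sum_i X_i' eps_i has covariance A.
# With eps independent of X and Var(eps) = sigma^2, A = sigma^2 * Q_XX here.
rng = np.random.default_rng(1)
n, k, reps, sigma = 400, 2, 2000, 1.5
A = sigma**2 * np.eye(k)                   # implied A (Q_XX = I for normal X)

sums = np.empty((reps, k))
for r in range(reps):
    X = rng.normal(size=(n, k))
    eps = sigma * rng.normal(size=n)
    sums[r] = (X * eps[:, None]).sum(axis=0) / np.sqrt(n)

print(np.cov(sums, rowvar=False))          # close to A for large n and reps
```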
Theorem 8.1 (Asymptotic Distribution of OLS Estimators) We assume the following:
1. The observations \(\{(Y_i,{\boldsymbol{X}}_i)\}_{i=1}^n\) are i.i.d. from the joint
distribution of \((Y,{\boldsymbol{X}})\).
2. \({\mathbb{E}\left[ Y^4 \right]}<\infty\)
3. \({\mathbb{E}\left[ ||{\boldsymbol{X}}||^4 \right]}<\infty\)
4. \({\mathbb{Q}}_{{\boldsymbol{XX}}}={\mathbb{E}\left[ {\boldsymbol{X}}^T{\boldsymbol{X}} \right]}\) is positive definite.
Under these assumptions, as \(n\to\infty\)
\[
\sqrt{n}(\widehat{{\boldsymbol{\beta}}}-{\boldsymbol{\beta}})\xrightarrow[d]{\quad\quad}
\mathcal{N}\left({\boldsymbol{0}},{\mathbb{V}}_{{\boldsymbol{\beta}}}\right),
\]
where
\[{\mathbb{V}}_{{\boldsymbol{\beta}}}\stackrel{\text{def}}{=}{\mathbb{Q}}_{{\boldsymbol{XX}}}^{-1}{\mathbb{A}}{\mathbb{Q}}_{{\boldsymbol{XX}}}^{-1}\]
and \({\mathbb{Q}}_{{\boldsymbol{XX}}}={\mathbb{E}\left[ {\boldsymbol{X}}^T{\boldsymbol{X}} \right]}\), \({\mathbb{A}}={\mathbb{E}\left[ {\boldsymbol{X}}^T{\boldsymbol{X}}\varepsilon^2 \right]}\).
The covariance matrix \({\mathbb{V}}_{{\boldsymbol{\beta}}}\) is called the asymptotic variance matrix of \(\widehat{{\boldsymbol{\beta}}}\). Because \({\mathbb{Q}}_{{\boldsymbol{XX}}}^{-1}\) appears on both sides of \({\mathbb{A}}\), this matrix is sometimes said to have a sandwich form.
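Theorem 8.1 can be checked end to end by Monte Carlo: the empirical covariance of \(\sqrt{n}(\widehat{{\boldsymbol{\beta}}}-{\boldsymbol{\beta}})\) across replications should approach \({\mathbb{V}}_{{\boldsymbol{\beta}}}\). A sketch under the illustrative assumptions of standard normal regressors and errors independent of \({\boldsymbol{X}}\), so that \({\mathbb{Q}}_{{\boldsymbol{XX}}}=\mathbb{I}\), \({\mathbb{A}}=\sigma^2\mathbb{I}\), and hence \({\mathbb{V}}_{{\boldsymbol{\beta}}}=\sigma^2\mathbb{I}\):

```python
import numpy as np

# Monte Carlo check of Theorem 8.1. With standard normal X and errors
# independent of X, Q_XX = I and A = sigma^2 * I, so V_beta = sigma^2 * I.
rng = np.random.default_rng(2)
n, k, reps, sigma = 500, 2, 2000, 1.5
beta = np.array([0.5, -1.0])
V_beta = sigma**2 * np.eye(k)              # implied asymptotic variance

draws = np.empty((reps, k))
for r in range(reps):
    X = rng.normal(size=(n, k))
    Y = X @ beta + sigma * rng.normal(size=n)
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
    draws[r] = np.sqrt(n) * (beta_hat - beta)

print(np.cov(draws, rowvar=False))         # close to V_beta
```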