5.4 Estimator Property: Biased or Unbiased?
- First, for a general case: Suppose that you have chosen some particular estimator, \(\hat{\theta}\), to estimate some characteristic, \(\theta\), of a random variable. How do you know if this estimator is unbiased?
- Second, for a specific case: Define the “sample average” to be the following: \(\frac{1}{N}\sum_{i=1}^{N} x_{i}\). Prove that this sample average estimator is an unbiased estimator of \(E[X]\).
- Third (easier), for a different specific case: Define the “smample smaverage” to be the following: \(\frac{1}{N^2}\sum_{i=1}^{N} x_{i}\). Prove that the smample smaverage is a biased estimator of \(E[X]\).
- Fourth (harder): Define the geometric mean to be \[\left(\prod_{i=1}^{N}x_{i}\right)^{\frac{1}{N}}.\] Prove that, for a positive and non-constant \(X\), the geometric mean is a biased estimator of \(E[X]\).
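Before attempting the proofs, it can help to check the claims informally by simulation: draw many samples from a distribution whose expected value is known, apply each estimator, and compare the average of the estimates with the true value. The sketch below assumes \(X \sim \text{Uniform}(1, 2)\), so that \(E[X] = 1.5\). The function names, the sample size `N`, and the number of replications `n_reps` are illustrative choices, not part of the exercise.

```r
set.seed(1)  # arbitrary seed, only so the run can be reproduced

# The three estimators from the list above (names are hypothetical)
sample_average    <- function(x) sum(x) / length(x)
smample_smaverage <- function(x) sum(x) / length(x)^2
geometric_mean    <- function(x) prod(x)^(1 / length(x))

# Assumed setup: X ~ Uniform(1, 2), so E[X] = 1.5
true_mean <- 1.5
N         <- 10
n_reps    <- 10000

# Each column of `draws` is one sample of size N
draws <- matrix(runif(N * n_reps, min = 1, max = 2), nrow = N)

# Average each estimator over the replications to approximate E[estimator]
approx_expectation <- c(
  sample_average    = mean(apply(draws, 2, sample_average)),
  smample_smaverage = mean(apply(draws, 2, smample_smaverage)),
  geometric_mean    = mean(apply(draws, 2, geometric_mean))
)

# Approximate bias: E[estimator] - E[X]
approx_bias <- approx_expectation - true_mean
approx_bias
```

In a run like this, the first entry should sit close to zero while the other two should not. A simulation of this kind is only suggestive for one assumed distribution; it does not replace the proofs.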
5.4.1 Is it unbiased, with data?
Suppose that you’re getting data from the following process:
random_distribution <- function(number_samples) {
  # Lower and upper bounds of the three candidate uniform distributions
  d1 <- c(1.0, 2.0)
  d2 <- c(1.1, 2.1)
  d3 <- c(1.5, 2.5)
  # Choose one of the three distributions, each with equal probability
  distribution_chooser <- sample(x=1:3, size=1)
  # Draw every observation in this sample from the chosen distribution
  if(distribution_chooser == 1) {
    x_ <- runif(n=number_samples, min=d1[1], max=d1[2])
  } else if(distribution_chooser == 2) {
    x_ <- runif(n=number_samples, min=d2[1], max=d2[2])
  } else if(distribution_chooser == 3) {
    x_ <- runif(n=number_samples, min=d3[1], max=d3[2])
  }
  return(x_)
}
random_distribution(number_samples=10)
## [1] 2.467197 2.366916 1.937715 1.691938 1.582294 2.083452 1.570361 2.027663
## [9] 1.972288 1.548191
## [1] 1.501582
Notice that there are two forms of inherent uncertainty in this function:
- There is uncertainty about the distribution that we are getting draws from; and,
- Within a distribution, we’re getting draws at random from a population distribution.
This class of functions, the r* functions, implements random generative processes in the R language. Look into ?Distributions to see more about this family of functions.
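As a small illustration of that family, base R pairs each built-in distribution with `d*`, `p*`, `q*`, and `r*` functions. The sketch below uses the uniform distribution on \([1, 2]\), the first of the three in the process above; the seed, the variable names, and the evaluation points are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed so the random draws can be reproduced

# r*: random draws from Uniform(1, 2)
draws <- runif(n = 5, min = 1, max = 2)

# d*, p*, q*: density, cumulative probability, and quantile
# of the same distribution
density_mid    <- dunif(1.5, min = 1, max = 2)  # density at x = 1.5
prob_below_mid <- punif(1.5, min = 1, max = 2)  # P(X <= 1.5)
median_x       <- qunif(0.5, min = 1, max = 2)  # 50th percentile

draws
```

Only the `r*` call involves randomness; the other three are deterministic descriptions of the population distribution that the draws come from.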
Suppose that you use the same sample average estimator to estimate the population expected value, \(E[X]\). Suppose that you get the following draws, followed by their sample average:
## [1] 1.946123 1.913084 1.711113 1.768129 1.573630 1.916366 1.149307 1.556135
## [9] 1.358160 1.889717
## [1] 1.678176
Is this sample average an unbiased estimator for the population expected value? How do you know?
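One way to build intuition for this question is to repeat the sampling process many times and compare the average of the resulting sample averages with the population expected value. The sketch below restates the process in compact form so that it runs on its own. The seed, `n_reps`, and `population_mean` are names and choices introduced here, and the population value relies on two assumptions taken from the function itself: each distribution is chosen with probability 1/3, and a Uniform(a, b) has mean (a + b) / 2.

```r
set.seed(203)  # arbitrary seed, only for reproducibility

# Compact restatement of the data-generating process above:
# pick one of three uniform distributions, then draw from it
random_distribution <- function(number_samples) {
  bounds <- list(c(1.0, 2.0), c(1.1, 2.1), c(1.5, 2.5))
  chosen <- bounds[[sample(x = 1:3, size = 1)]]
  runif(n = number_samples, min = chosen[1], max = chosen[2])
}

# Population expected value: the three distribution means are 1.5, 1.6, and 2.0,
# and each distribution is equally likely to be chosen
population_mean <- mean(c(1.5, 1.6, 2.0))

# Repeat the experiment many times and record each sample average
n_reps <- 20000
sample_averages <- replicate(
  n_reps,
  mean(random_distribution(number_samples = 10))
)

# If the estimator is unbiased, these two values should be close
mean(sample_averages)
population_mean
```

Any single sample average, such as the one shown above, can land far from the population value because all ten draws come from the same randomly chosen distribution. Bias is a statement about the average over repeated samples, which is what the simulation approximates.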