Correlation Coefficient and Linear Regression
Consider estimating a random variable d by a constant b according to the mean square error criterion,
$$ J = E[(d - b)^2] $$
where E[.] is the expected value operator (see Appendix A). The best b is obtained by taking the derivative of J with respect to b and setting the result to zero, which yields b* = E[d]. Substituting this value back into the criterion gives the smallest error, which is the variance of d:
$$ J_{min} = E[(d - E[d])^2] = \sigma_d^2 $$
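The claim that the mean is the best constant estimate can be checked numerically. The following is a minimal sketch, assuming synthetic Gaussian samples for d and using sample averages in place of E[.]; the names `mse_constant`, `b_star` and `j_min` are hypothetical and chosen only for this illustration.

```python
import numpy as np

# Hypothetical data: samples standing in for the random variable d.
rng = np.random.default_rng(0)
d = rng.normal(loc=2.0, scale=1.5, size=10_000)

def mse_constant(d, b):
    """Sample estimate of E[(d - b)^2] for a constant estimate b."""
    return np.mean((d - b) ** 2)

# The best constant is the (sample) mean of d.
b_star = d.mean()
j_min = mse_constant(d, b_star)

# Evaluate the error on a grid of other constants around b_star.
candidates = np.linspace(b_star - 1.0, b_star + 1.0, 21)
errors = [mse_constant(d, b) for b in candidates]

print("b* =", b_star)
print("minimum error =", j_min, " sample variance of d =", d.var())
```

With sample averages the error at `b_star` equals the sample variance of d, and every other constant on the grid gives an error at least as large.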
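Now if we try to approximate d by wx + b, the criterion becomes
$$ J = E[(d - wx - b)^2] $$
and, for any fixed w, the best b is b* = E[d] - wE[x]. Substituting b*, the best w can be found by solving the problem
$$ \min_w \; E[((d - E[d]) - w(x - E[x]))^2] $$
It is easy to show by differentiation that the best w is
$$ w^* = \frac{\sigma_{xd}}{\sigma_x^2} = r\,\frac{\sigma_d}{\sigma_x} \tag{1B.4} $$
where $\sigma_x$ and $\sigma_d$ are the standard deviations of x and d, $\sigma_{xd}$ is their covariance, and $r = \sigma_{xd}/(\sigma_x \sigma_d)$ is the correlation coefficient.
Therefore the minimum mean square error linear estimator for d is
$$ \hat{d} = r\,\sigma_d \left( \frac{x - E[x]}{\sigma_x} \right) + E[d] \tag{1B.5} $$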
The term $(x - E[x])/\sigma_x$ in parentheses is a zero-mean, unit-variance version of the input x, so the product with $r\,\sigma_d$ rescales it to the standard deviation of d, weighted by the correlation coefficient r. The term E[d] just guarantees the correct mean.
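The linear estimator can be computed directly from sample statistics. The sketch below assumes that Eq. 1B.4 reads w* = r σ_d / σ_x, with r the correlation coefficient and σ_x, σ_d the standard deviations of x and d, and it uses synthetic data. The helper `linear_mmse` is a hypothetical name, and `np.polyfit` serves only as an independent least-squares cross-check.

```python
import numpy as np

def linear_mmse(x, d):
    """Return (w, b) with w = r * sigma_d / sigma_x and b = E[d] - w * E[x],
    where all expectations are replaced by sample averages."""
    r = np.corrcoef(x, d)[0, 1]      # sample correlation coefficient
    w = r * d.std() / x.std()        # slope, assumed form of Eq. 1B.4
    b = d.mean() - w * x.mean()      # bias that guarantees the correct mean
    return w, b

# Hypothetical data: d is a noisy linear function of x.
rng = np.random.default_rng(1)
x = rng.normal(3.0, 2.0, size=5_000)
d = 0.7 * x - 1.0 + rng.normal(0.0, 0.5, size=5_000)

w_star, b_star = linear_mmse(x, d)

# Ordinary least-squares line fit; returns [slope, intercept] for deg=1.
w_ls, b_ls = np.polyfit(x, d, deg=1)

print("correlation form:", w_star, b_star)
print("least squares   :", w_ls, b_ls)
```

On a finite sample the two pairs should agree up to rounding, since the least-squares line is the sample version of the minimum mean square error linear estimator.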
It is interesting to note that if d and x are uncorrelated (correlation coefficient $r = 0$), the best estimate of d is its mean. However, if x and d are exactly correlated ($|r| = 1$ in Eq. 1B.5), the estimate is greatly improved. The minimum mean square error is
$$ J_{min} = \sigma_d^2 \, (1 - r^2) $$
This equation shows that $r^2$ can in fact be interpreted as the fraction of the variance in the data that is captured by the linear model.
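The relation between the minimum error and the correlation coefficient can be verified on data. This is a minimal sketch under the assumption that the minimum mean square error is σ_d²(1 - r²), with r the correlation coefficient and σ_d² the variance of d. The data are synthetic, and the variable names (`predicted_mse`, `explained_fraction`) are illustrative only.

```python
import numpy as np

# Hypothetical data with a partial linear dependence between x and d.
rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, size=5_000)
d = 1.5 * x + rng.normal(0.0, 1.0, size=5_000)

# Best linear estimator from sample statistics.
r = np.corrcoef(x, d)[0, 1]
w = r * d.std() / x.std()
b = d.mean() - w * x.mean()

# Measured error of the linear model versus the closed-form expression.
residual = d - (w * x + b)
mse = np.mean(residual ** 2)
predicted_mse = d.var() * (1.0 - r ** 2)

# Fraction of the variance of d captured by the linear model.
explained_fraction = 1.0 - mse / d.var()

print("measured MSE:", mse, " sigma_d^2 (1 - r^2):", predicted_mse)
print("explained fraction:", explained_fraction, " r^2:", r ** 2)
```

The explained fraction computed this way coincides with r², which is the reading of r² as captured variance.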
There is a very interesting interpretation of the mean square estimation solution. Note that
$$ E[\{ d - (w^* x + b^*) \}\, x] = 0 \tag{1B.6} $$
which means that the error (the quantity inside the curly braces) is orthogonal to the input.
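Orthogonality between the error and the input can also be observed numerically. The sketch below assumes the error is d - (w*x + b*) with the optimal w* and b*, replaces expectations by sample averages, and uses synthetic data. The names `error_dot_input` and `error_mean` are hypothetical.

```python
import numpy as np

# Hypothetical data; the input deliberately has a nonzero mean.
rng = np.random.default_rng(3)
x = rng.normal(3.0, 2.0, size=5_000)
d = -0.4 * x + 2.0 + rng.normal(0.0, 1.0, size=5_000)

# Optimal linear estimator from sample statistics.
r = np.corrcoef(x, d)[0, 1]
w = r * d.std() / x.std()
b = d.mean() - w * x.mean()

error = d - (w * x + b)

# Sample version of E[error * x]; close to zero at the optimum.
error_dot_input = np.mean(error * x)
# The error is also zero-mean because of the choice of b.
error_mean = np.mean(error)

# A perturbed slope (with the same b) breaks the orthogonality.
error_off = d - ((w + 0.1) * x + b)
off_dot_input = np.mean(error_off * x)

print("E[e x] at optimum:", error_dot_input)
print("E[e x] with perturbed slope:", off_dot_input)
```

Only the optimal coefficients drive the average of error times input to zero (up to rounding); moving the slope away from w* leaves a component of the error aligned with the input.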