1.12 Minimum Mean-Square Error Estimation
Let $ \mathbf{X} $ and $ \mathbf{Y} $ be two jointly distributed random variables (RVs). Suppose we want to estimate the value of $ \mathbf{Y} $ given the value of $ \mathbf{X} $ (i.e., given that we observe $ \left\{ \mathbf{X}=x\right\} $). What is the “best” estimate of $ \mathbf{Y} $? One commonly used error criterion is the squared error, in which case the goal is to minimize the mean-square error: we wish to find a function $ c\left(x\right) $ that estimates $ \mathbf{Y} $ given $ \mathbf{X}=x $ such that $ \epsilon=E\left[\left(\mathbf{Y}-c\left(\mathbf{X}\right)\right)^{2}\right] $ is minimized.
Claim
The mean-square error is minimized by the function $ c\left(x\right)=E\left[\mathbf{Y}|\mathbf{X}=x\right] $.
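A short justification of the claim can be sketched as follows, assuming $ \mathbf{Y} $ has a finite second moment. The shorthand $ g\left(x\right)=E\left[\mathbf{Y}|\mathbf{X}=x\right] $ is introduced here only for this sketch. For any candidate estimator $ c $, add and subtract $ g\left(\mathbf{X}\right) $ inside the square:
$ \epsilon=E\left[\left(\mathbf{Y}-g\left(\mathbf{X}\right)\right)^{2}\right]+2E\left[\left(\mathbf{Y}-g\left(\mathbf{X}\right)\right)\left(g\left(\mathbf{X}\right)-c\left(\mathbf{X}\right)\right)\right]+E\left[\left(g\left(\mathbf{X}\right)-c\left(\mathbf{X}\right)\right)^{2}\right] $
Conditioning on $ \mathbf{X} $ shows that the cross term vanishes, because $ E\left[\mathbf{Y}-g\left(\mathbf{X}\right)|\mathbf{X}\right]=0 $:
$ E\left[\left(\mathbf{Y}-g\left(\mathbf{X}\right)\right)\left(g\left(\mathbf{X}\right)-c\left(\mathbf{X}\right)\right)\right]=E\left[\left(g\left(\mathbf{X}\right)-c\left(\mathbf{X}\right)\right)E\left[\mathbf{Y}-g\left(\mathbf{X}\right)|\mathbf{X}\right]\right]=0 $
Hence
$ \epsilon=E\left[\left(\mathbf{Y}-g\left(\mathbf{X}\right)\right)^{2}\right]+E\left[\left(g\left(\mathbf{X}\right)-c\left(\mathbf{X}\right)\right)^{2}\right]\geq E\left[\left(\mathbf{Y}-g\left(\mathbf{X}\right)\right)^{2}\right] $
with equality when $ c\left(\mathbf{X}\right)=g\left(\mathbf{X}\right) $ with probability 1.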
We will use the following notation for the minimum mean-square error (MMS) estimators.
$ \hat{y}_{MMS}\left(x\right)=E\left[\mathbf{Y}|\mathbf{X}=x\right] $
$ \hat{x}_{MMS}\left(y\right)=E\left[\mathbf{X}|\mathbf{Y}=y\right] $
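As a numerical illustration of the MMS estimator, the following is a minimal sketch for a pair of discrete RVs. The joint pmf `p_xy`, the value grids `x_vals` and `y_vals`, and the helper names `mmse_estimator` and `mean_square_error` are all hypothetical, chosen only for this example; they do not come from the notes.

```python
import numpy as np

# Hypothetical joint pmf: p_xy[i, j] = P(X = x_vals[i], Y = y_vals[j]).
# The numbers are made up for illustration and sum to 1.
x_vals = np.array([0.0, 1.0])
y_vals = np.array([0.0, 1.0, 2.0])
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.05, 0.15, 0.40]])

def mmse_estimator(p_xy, y_vals):
    """Return E[Y | X = x] for each x (one entry per row of p_xy)."""
    p_x = p_xy.sum(axis=1)               # marginal pmf of X
    p_y_given_x = p_xy / p_x[:, None]    # conditional pmf of Y given X = x
    return p_y_given_x @ y_vals          # conditional mean for each x

def mean_square_error(p_xy, y_vals, c):
    """Return E[(Y - c(X))^2], where c[i] is the estimate used when X = x_vals[i]."""
    sq_err = (y_vals[None, :] - c[:, None]) ** 2
    return float((p_xy * sq_err).sum())

y_hat_mms = mmse_estimator(p_xy, y_vals)
mse_mms = mean_square_error(p_xy, y_vals, y_hat_mms)

# Any other estimator, e.g. a shifted one, should do no better.
mse_shifted = mean_square_error(p_xy, y_vals, y_hat_mms + 0.2)
print(y_hat_mms, mse_mms, mse_shifted)
```

For this pmf the conditional means are 1 when $ x=0 $ and 19/12 when $ x=1 $, and shifting the estimate by a constant $ d $ raises the mean-square error by exactly $ d^{2} $.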
Maximum a posteriori probability (MAP) estimator
$ \hat{y}_{MAP}\left(x\right)=\arg\max_{y}\left\{ f_{\mathbf{Y}}\left(y|x\right)\right\} $
$ \hat{x}_{MAP}\left(y\right)=\arg\max_{x}\left\{ f_{\mathbf{X}}\left(x|y\right)\right\} $
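The MAP rule can be illustrated with a minimal sketch for discrete RVs, where a conditional pmf stands in for the conditional density (an assumption of this example, not something stated in the notes). The joint pmf `p_xy`, the grid `y_vals`, and the helper `map_estimator` are hypothetical names and values chosen only for illustration.

```python
import numpy as np

# Hypothetical joint pmf: p_xy[i, j] = P(X = x_vals[i], Y = y_vals[j]).
x_vals = np.array([0.0, 1.0])
y_vals = np.array([0.0, 1.0, 2.0])
p_xy = np.array([[0.10, 0.20, 0.10],
                 [0.05, 0.15, 0.40]])

def map_estimator(p_xy, y_vals):
    """Return argmax over y of P(Y = y | X = x) for each x (row of p_xy).

    Dividing a row by P(X = x) does not change where its maximum sits,
    so the argmax can be taken directly on the joint pmf.
    """
    return y_vals[np.argmax(p_xy, axis=1)]

y_hat_map = map_estimator(p_xy, y_vals)

# Conditional means, computed here only to compare against the MAP estimates.
y_cond_mean = (p_xy / p_xy.sum(axis=1, keepdims=True)) @ y_vals
print(y_hat_map, y_cond_mean)
```

For this pmf the MAP estimates are 1 for $ x=0 $ and 2 for $ x=1 $, while the conditional mean for $ x=1 $ is 19/12, so the two estimators need not agree.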