Revision as of 09:31, 20 May 2013 by Rhea

BPE FOR MULTIVARIATE GAUSSIAN

Complement to Lecture 7: Maximum Likelihood Estimation and Bayesian Parameter Estimation, ECE662, Spring 2010, Prof. Boutin


Estimation of mean, given a known covariance

Consider a set of iid samples $ \{X_i\}_{i=1}^N $, where $ X_i \in\mathbb{R}^n $ and $ X_i \sim N(\mu,\Sigma) $. Suppose we know $ \Sigma $ but wish to estimate $ \mu $ using Bayesian parameter estimation (BPE). If we assume a Gaussian prior distribution for the unknown mean, the posterior distribution of the mean is also Gaussian, i.e. $ p(\mu|X_1,X_2,\ldots,X_N) = N(\mu_N,\Sigma_N) $, where $ \mu_N $ and $ \Sigma_N $ combine our prior knowledge of $ \mu $ with the samples $ \{X_i\}_{i=1}^N $. Fukunaga p. 391 derives the parameters $ \mu_N $ and $ \Sigma_N $ as follows:

$ \mu_N = \frac{\Sigma}{N}(\Sigma_\mu + \frac{\Sigma}{N})^{-1}\mu_0 + \Sigma_\mu(\Sigma_\mu + \frac{\Sigma}{N})^{-1}\left(\frac1N\sum_{i=1}^NX_i\right) $,

where $ \mu_0 $ is the initial "guess" for the mean $ \mu $, and $ \Sigma_\mu $ is the covariance expressing our uncertainty about that guess (a small $ \Sigma_\mu $ means high confidence). In other words, $ N(\mu_0,\Sigma_\mu) $ is the prior distribution for $ \mu $ that we would assume without seeing any samples. For the covariance parameter, we have

$ \Sigma_N = \Sigma_\mu(\Sigma_\mu+\frac{\Sigma}{N})^{-1}\frac{\Sigma}{N} $.
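As a sanity check on these expressions, consider the scalar case $ n=1 $. Writing $ \sigma^2 $ for the known variance, $ \sigma_\mu^2 $ for the prior variance of $ \mu $, and $ \bar{x} = \frac1N\sum_{i=1}^N x_i $ for the sample mean (notation introduced here only for illustration), the formulas reduce to

$ \mu_N = \frac{\sigma^2/N}{\sigma_\mu^2+\sigma^2/N}\,\mu_0 + \frac{\sigma_\mu^2}{\sigma_\mu^2+\sigma^2/N}\,\bar{x} $, and $ \sigma_N^2 = \frac{\sigma_\mu^2\,\sigma^2/N}{\sigma_\mu^2+\sigma^2/N} $, or equivalently $ \frac{1}{\sigma_N^2} = \frac{1}{\sigma_\mu^2}+\frac{N}{\sigma^2} $.

For instance, with the made-up values $ \sigma^2 = 2 $, $ \sigma_\mu^2 = 1 $, $ \mu_0 = 0 $, $ N = 2 $ and $ \bar{x} = 2 $, both weights equal $ \frac12 $, so $ \mu_N = \frac12\cdot 0 + \frac12\cdot 2 = 1 $ and $ \sigma_N^2 = \frac12 $. The posterior mean sits between the prior guess and the sample mean, and the posterior variance is smaller than both $ \sigma_\mu^2 $ and $ \sigma^2/N $.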

As the number of samples increases, the effect of the prior knowledge ($ \mu_0 $, $ \Sigma_\mu $) decreases: for nonsingular $ \Sigma_\mu $, the posterior mean approaches the sample mean and the posterior covariance vanishes, i.e.

$ \lim_{N\rightarrow\infty}\left(\mu_N - \frac1N\sum_{i=1}^NX_i\right) = 0 $, and $ \lim_{N\rightarrow\infty}\Sigma_N = 0 $.
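The update for the mean can be sketched numerically as follows. This is a minimal illustration in Python with NumPy, not code from the lecture or from Fukunaga: the function name `posterior_mean_params` and the synthetic data are assumptions made for the example, and the prior covariance of $ \mu $ is taken to be $ \Sigma_\mu $ in both formulas.

```python
import numpy as np

def posterior_mean_params(X, Sigma, mu0, Sigma_mu):
    """Return (mu_N, Sigma_N) of the Gaussian posterior for the mean.

    X        : (N, n) array of samples, one sample per row
    Sigma    : (n, n) known covariance of the samples
    mu0      : (n,)   prior guess for the mean
    Sigma_mu : (n, n) prior covariance of the mean
    """
    X = np.asarray(X, dtype=float)
    mu0 = np.asarray(mu0, dtype=float)
    Sigma_mu = np.asarray(Sigma_mu, dtype=float)
    N = X.shape[0]
    S = np.asarray(Sigma, dtype=float) / N   # Sigma / N
    A = np.linalg.inv(Sigma_mu + S)          # (Sigma_mu + Sigma/N)^{-1}
    xbar = X.mean(axis=0)                    # sample mean
    mu_N = S @ A @ mu0 + Sigma_mu @ A @ xbar
    Sigma_N = Sigma_mu @ A @ S
    return mu_N, Sigma_N

# Hypothetical data: the prior guess is deliberately far from the true mean.
rng = np.random.default_rng(0)
true_mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
for N in (5, 50, 5000):
    X = rng.multivariate_normal(true_mu, Sigma, size=N)
    mu_N, Sigma_N = posterior_mean_params(X, Sigma, np.zeros(2), np.eye(2))
    print(N, mu_N, np.trace(Sigma_N))
```

Running the loop with increasing $ N $ should show $ \mu_N $ moving from the prior guess toward the sample mean while the trace of $ \Sigma_N $ shrinks, which is the behavior described by the limits above.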

Estimation of covariance, given a known mean

Again, given iid samples $ \{X_i\}_{i=1}^N $, $ X_i \in\mathbb{R}^n $, $ X_i \sim N(\mu,\Sigma) $, let us now estimate $ \Sigma $ with $ \mu $ known. As in Fukunaga p. 392, we assume that the samples are normally distributed given $ \Sigma $ (i.e. $ p(X|\Sigma) = N(\mu,\Sigma) $), and it can be shown that the sample covariance matrix then follows a Wishart distribution. Fukunaga p. 392 gives the distribution $ p(K|\Sigma_0,N_0) $, where $ K = \Sigma^{-1} $, the parameter $ \Sigma_0 $ represents the initial "guess" for $ \Sigma $, and $ N_0 $ represents "how many samples were used to compute $ \Sigma_0 $". Note that we work with the distribution of $ K = \Sigma^{-1} $ instead of $ \Sigma $ directly, since it is the inverse covariance matrix that appears in the normal density. It can be shown, then, that

$ p(K|\Sigma_0,N_0) = c(n,N_0)\left|\frac12N_0\Sigma_0\right|^{(N_0-1)/2}|K|^{(N_0-n-2)/2}\exp(-\frac12\mathrm{trace}(N_0\Sigma_0K)) $,

where $ c(n,N_0) = \left\{\pi^{n(n-1)/4}\prod_{i=1}^n\Gamma\left(\frac{N_0-i}{2}\right)\right\}^{-1} $.
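The density above is easier to evaluate in log form, since the Gamma functions and determinants can overflow for moderate $ N_0 $. The following is a minimal sketch in Python with NumPy that transcribes the formula term by term; the function name `log_wishart_prior` and the example values are assumptions made for illustration, and the code requires $ N_0 > n $ so that every Gamma argument is positive.

```python
import math
import numpy as np

def log_wishart_prior(K, Sigma0, N0):
    """Log of p(K | Sigma0, N0) as written above (requires N0 > n)."""
    K = np.asarray(K, dtype=float)
    Sigma0 = np.asarray(Sigma0, dtype=float)
    n = K.shape[0]
    if N0 <= n:
        raise ValueError("N0 must exceed the dimension n")
    # log c(n, N0) = -[ n(n-1)/4 * log(pi) + sum_i log Gamma((N0 - i)/2) ]
    log_c = -(n * (n - 1) / 4.0 * math.log(math.pi)
              + sum(math.lgamma((N0 - i) / 2.0) for i in range(1, n + 1)))
    # slogdet returns (sign, log|det|); a positive sign is a necessary
    # (not sufficient) condition for positive definiteness.
    sign_s, logdet_s = np.linalg.slogdet(0.5 * N0 * Sigma0)
    sign_k, logdet_k = np.linalg.slogdet(K)
    if sign_s <= 0 or sign_k <= 0:
        raise ValueError("Sigma0 and K must be positive definite")
    return (log_c
            + 0.5 * (N0 - 1) * logdet_s
            + 0.5 * (N0 - n - 2) * logdet_k
            - 0.5 * N0 * np.trace(Sigma0 @ K))

# Hypothetical usage: evaluate the log-density at K = Sigma0^{-1}.
Sigma0 = np.array([[2.0, 0.3], [0.3, 1.0]])
print(log_wishart_prior(np.linalg.inv(Sigma0), Sigma0, N0=10))
```

For $ n=1 $ the expression reduces to a Gamma density in $ K $ with shape $ (N_0-1)/2 $ and rate $ N_0\Sigma_0/2 $, which gives a simple way to check an implementation.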

Simultaneous estimation of unknown mean and covariance

Finally, given iid samples $ \{X_i\}_{i=1}^N $, $ X_i \in\mathbb{R}^n $, $ X_i \sim N(\mu,\Sigma) $, we now wish to estimate both $ \mu $ and $ \Sigma $ (or $ K = \Sigma^{-1} $). Fukunaga p. 393 shows that the joint distribution of $ \mu $ and $ K $ is the Gauss-Wishart distribution

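$ p(\mu,K|\mu_0,\Sigma_0,\mu_{\Sigma},N_0) = (2\pi)^{-n/2}|\mu_{\Sigma} K|^{1/2}\exp\left(-\frac12\mu_{\Sigma}(\mu-\mu_0)^TK(\mu-\mu_0) \right)\times c(n,N_0)\left|\frac12N_0\Sigma_0\right|^{(N_0-1)/2}|K|^{(N_0-n-2)/2}\exp\left(-\frac12\mathrm{trace}(N_0\Sigma_0K)\right) $,

where $ \mu_0 $, $ \Sigma_0 $, $ N_0 $, and $ c(n,N_0) $ are as above, and $ \mu_{\Sigma} $ is a positive scalar that scales the precision of the Gaussian factor, so that given $ K $ the mean has prior $ N(\mu_0,(\mu_{\Sigma}K)^{-1}) $.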
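One way to get a feel for this joint density is to draw samples from it. The sketch below, in Python with NumPy, is an illustration under stated assumptions rather than a method from Fukunaga: it reads the Wishart factor as a Wishart distribution on $ K $ with $ N_0-1 $ degrees of freedom and scale matrix $ (N_0\Sigma_0)^{-1} $, assumes $ N_0 $ is an integer with $ N_0-1\geq n $ so that $ K $ can be built as a sum of outer products of Gaussian vectors, and then draws $ \mu $ given $ K $ from $ N(\mu_0,(\mu_\Sigma K)^{-1}) $. The function name `sample_gauss_wishart` is hypothetical.

```python
import numpy as np

def sample_gauss_wishart(mu0, Sigma0, mu_Sigma, N0, rng):
    """Draw one (mu, K) pair from the Gauss-Wishart density above.

    Assumes integer N0 with N0 - 1 >= n (n = dimension of mu0).
    """
    mu0 = np.asarray(mu0, dtype=float)
    Sigma0 = np.asarray(Sigma0, dtype=float)
    n = mu0.shape[0]
    dof = int(N0) - 1
    if dof < n:
        raise ValueError("need N0 - 1 >= n for this construction")
    # Wishart draw: K = sum of dof outer products z z^T, z ~ N(0, V),
    # with scale matrix V = (N0 * Sigma0)^{-1}.
    V = np.linalg.inv(N0 * Sigma0)
    Z = rng.multivariate_normal(np.zeros(n), V, size=dof)  # shape (dof, n)
    K = Z.T @ Z
    # Gaussian draw: mu | K ~ N(mu0, (mu_Sigma * K)^{-1}).
    cov = np.linalg.inv(mu_Sigma * K)
    cov = 0.5 * (cov + cov.T)  # remove round-off asymmetry
    mu = rng.multivariate_normal(mu0, cov)
    return mu, K

# Hypothetical usage with made-up hyperparameters.
rng = np.random.default_rng(0)
mu, K = sample_gauss_wishart([0.0, 0.0], np.eye(2), mu_Sigma=2.0, N0=6, rng=rng)
print(mu)
print(np.linalg.inv(K))  # the implied covariance matrix Sigma = K^{-1}
```

Larger $ N_0 $ concentrates the draws of $ K^{-1} $ more tightly, and larger $ \mu_\Sigma $ concentrates the draws of $ \mu $ around $ \mu_0 $, matching the roles these parameters play as confidence values.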


