Difference between revisions of "Bayesian Parameter Estimation OldKiwi" - Rhea

Latest revision as of 12:21, 6 March 2008

`BPE - Bayesian Parameter Estimation from Lecture 7 <https://engineering.purdue.edu/people/mireille.boutin.1/ECE301kiwi/Lecture7>`_

BPE FOR MULTIVARIATE GAUSSIAN :

Estimation of mean, given a known covariance

Consider a set of iid samples $\{X_i\}_{i=1}^N$ where $X_i \in\mathbb{R}^n$ and $X_i \sim N(\mu,\Sigma)$ . Suppose we know $\Sigma$ , but wish to estimate $\mu$ using BPE. If we assume a Gaussian prior distribution for the unknown mean, the posterior distribution for the mean is also Gaussian, i.e. $p(\mu|X_1,X_2,\ldots,X_N) = N(\mu_N,\Sigma_N)$ , where $\mu_N$ and $\Sigma_N$ combine our prior knowledge of $\mu$ with the samples $\{X_i\}_{i=1}^N$ . Fukunaga p. 391 derives the parameters $\mu_N$ and $\Sigma_N$ as follows:

$\mu_N = \frac{\Sigma}{N}(\Sigma_\mu + \frac{\Sigma}{N})^{-1}\mu_0 + \Sigma_\mu(\Sigma_\mu + \frac{\Sigma}{N})^{-1}\left(\frac1N\sum_{i=1}^NX_i\right)$ ,

where $\mu_0$ is the initial "guess" for the mean $\mu$ , and $\Sigma_\mu$ is the "confidence" in that guess. In other words, we can consider that $N(\mu_0,\Sigma_\mu)$ is the prior distribution for $\mu$ that we would assume without seeing any samples. For the covariance parameter, we have

$\Sigma_N = \Sigma_\mu(\Sigma_\mu+\frac{\Sigma}{N})^{-1}\frac{\Sigma}{N}$ .
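The two update formulas can be sketched numerically as follows. This is only an illustrative sketch: the function name `posterior_mean_params`, the test values, and the use of NumPy are assumptions, not part of the lecture, and the prior covariance in the formula for $\Sigma_N$ is taken to be $\Sigma_\mu$ .

```python
import numpy as np

def posterior_mean_params(X, Sigma, mu0, Sigma_mu):
    """Return (mu_N, Sigma_N) for the posterior of the mean, Sigma known.

    X is an (N, n) array of samples; mu0 and Sigma_mu define the
    Gaussian prior N(mu0, Sigma_mu) on the unknown mean.
    """
    N = X.shape[0]
    S = Sigma / N                        # Sigma / N
    A = np.linalg.inv(Sigma_mu + S)      # (Sigma_mu + Sigma/N)^{-1}
    xbar = X.mean(axis=0)                # sample mean
    mu_N = S @ A @ mu0 + Sigma_mu @ A @ xbar
    Sigma_N = Sigma_mu @ A @ S
    return mu_N, Sigma_N

# Hypothetical example values, chosen only for illustration.
rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.3], [0.3, 1.0]])
true_mu = np.array([1.0, -1.0])
mu0 = np.zeros(2)
Sigma_mu = np.eye(2)

for N in (1, 10, 1000):
    X = rng.multivariate_normal(true_mu, Sigma, size=N)
    mu_N, Sigma_N = posterior_mean_params(X, Sigma, mu0, Sigma_mu)
    print(N, mu_N, np.trace(Sigma_N))
```

With more samples, the printed $\mu_N$ should move from the prior guess toward the sample mean, and the trace of $\Sigma_N$ should shrink.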

We find that as the number of samples increases, the effect of the prior knowledge ( $\mu_0$ , $\Sigma_\mu$ ) decreases, so that

$\lim_{N\rightarrow\infty}\left(\mu_N - \frac1N\sum_{i=1}^NX_i\right) = 0$ , and $\lim_{N\rightarrow\infty}\Sigma_N = 0$ .
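As a worked check of these limits, consider the scalar case $n=1$ with $\Sigma = \sigma^2$ and prior variance $\Sigma_\mu = \sigma_\mu^2$ . This reads the prior covariance in the formula for $\Sigma_N$ as $\Sigma_\mu$ ; the symbols $\sigma^2$ , $\sigma_\mu^2$ and $\bar X$ are introduced here only for illustration. Writing $\bar X = \frac1N\sum_{i=1}^NX_i$ , the formulas above reduce to

$\mu_N = \frac{\sigma^2}{N\sigma_\mu^2+\sigma^2}\mu_0 + \frac{N\sigma_\mu^2}{N\sigma_\mu^2+\sigma^2}\bar X$ , and $\Sigma_N = \frac{\sigma_\mu^2\sigma^2}{N\sigma_\mu^2+\sigma^2}$ .

The weight on $\mu_0$ and the variance $\Sigma_N$ both decay like $1/N$ , while the weight on $\bar X$ tends to 1. For instance, with $\sigma^2 = \sigma_\mu^2 = 1$ we get $\mu_N = \frac{\mu_0 + N\bar X}{N+1}$ and $\Sigma_N = \frac{1}{N+1}$ , so the prior behaves like one extra sample located at $\mu_0$ .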

Estimation of covariance, given a known mean

Again, given iid samples $\{X_i\}_{i=1}^N$ , $X_i \in\mathbb{R}^n$ , $X_i \sim N(\mu,\Sigma)$ , let us now estimate $\Sigma$ with $\mu$ known. As in Fukunaga p. 392, we assume that the samples are normally distributed given $\Sigma$ (i.e. $p(X|\Sigma) = N(\mu,\Sigma)$ ); it can be shown that the sample covariance matrix then follows a Wishart distribution. Fukunaga p. 392 shows the distribution $p(K|\Sigma_0,N_0)$ , where $K = \Sigma^{-1}$ , the parameter $\Sigma_0$ represents the initial "guess" for $\Sigma$ , and $N_0$ represents "how many samples were used to compute $\Sigma_0$ ". Note that we compute the distribution for $K = \Sigma^{-1}$ instead of $\Sigma$ directly, since the inverse covariance matrix is what appears in the definition of a normal distribution. It can be shown, then, that

$p(K|\Sigma_0,N_0) = c(n,N_0)\left|\frac12N_0\Sigma_0\right|^{(N_0-1)/2}|K|^{(N_0-n-2)/2}\exp(-\frac12\mathrm{trace}(N_0\Sigma_0K))$ ,

where $c(n,N_0) = \left\{\pi^{n(n-1)/4}\prod_{i=1}^n\Gamma\left(\frac{N_0-i}{2}\right)\right\}^{-1}$ .
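The density above can be evaluated numerically; the sketch below implements it in log form, term by term. Matching exponents and normalizing constants suggests it coincides with a standard Wishart density on $K$ with $N_0-1$ degrees of freedom and scale matrix $(N_0\Sigma_0)^{-1}$ ; that reading is an assumption made here, and the code cross-checks it against `scipy.stats.wishart`. The function name and example values are hypothetical.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import wishart

def log_p_K(K, Sigma0, N0):
    """Log of p(K | Sigma0, N0), written term by term from the formula."""
    n = K.shape[0]
    i = np.arange(1, n + 1)
    # log c(n, N0) = -[ n(n-1)/4 * log(pi) + sum_i log Gamma((N0 - i)/2) ]
    log_c = -(n * (n - 1) / 4.0) * np.log(np.pi) - gammaln((N0 - i) / 2.0).sum()
    _, logdet_half = np.linalg.slogdet(0.5 * N0 * Sigma0)
    _, logdet_K = np.linalg.slogdet(K)
    return (log_c
            + 0.5 * (N0 - 1) * logdet_half
            + 0.5 * (N0 - n - 2) * logdet_K
            - 0.5 * np.trace(N0 * Sigma0 @ K))

# Hypothetical example values, chosen only for illustration.
Sigma0 = np.array([[1.0, 0.2], [0.2, 2.0]])
N0 = 10
K = np.linalg.inv(Sigma0)

# Assumed equivalent Wishart parameterization (symmetrized for safety).
scale = np.linalg.inv(N0 * Sigma0)
scale = 0.5 * (scale + scale.T)
reference = wishart.logpdf(K, df=N0 - 1, scale=scale)

print(log_p_K(K, Sigma0, N0), reference)
```

The two printed values should agree up to floating-point error if the assumed parameterization is right.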

Simultaneous estimation of unknown mean and covariance

Finally, given iid samples $\{X_i\}_{i=1}^N$ , $X_i \in\mathbb{R}^n$ , $X_i \sim N(\mu,\Sigma)$ , we now wish to estimate both $\mu$ and $\Sigma$ (or $K = \Sigma^{-1}$ ). Fukunaga p. 393 gives the joint distribution as the following Gauss-Wishart distribution:

$p(\mu,K|\mu_0,\Sigma_0,\mu_{\Sigma},N_0) = (2\pi)^{-n/2}|\mu_{\Sigma} K|^{1/2}\exp\left(-\frac12\mu_{\Sigma}(\mu-\mu_0)^TK(\mu-\mu_0) \right)\times c(n,N_0)\left|\frac12N_0\Sigma_0\right|^{(N_0-1)/2}|K|^{(N_0-n-2)/2}\exp\left(-\frac12\mathrm{trace}(N_0\Sigma_0K)\right)$ ,

where $\mu_0$ , $\Sigma_0$ , $N_0$ , and $c(n,N_0)$ are as above, and the scalar $\mu_{\Sigma}$ weights the confidence in $\mu_0$ (the first factor is a Gaussian in $\mu$ with mean $\mu_0$ and inverse covariance $\mu_{\Sigma}K$ ).
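The product structure of this density can be sketched in code: the first factor is a Gaussian in $\mu$ with mean $\mu_0$ and covariance $(\mu_{\Sigma}K)^{-1}$ , and the second factor is the density of $K$ from the previous section. The sketch assumes that the second factor is a Wishart density with $N_0-1$ degrees of freedom and scale $(N_0\Sigma_0)^{-1}$ , which is a reading of the formula and not a statement taken from the lecture; the function names and example values are hypothetical.

```python
import numpy as np
from scipy.stats import multivariate_normal, wishart

def _sym_inv(A):
    """Inverse of a symmetric matrix, re-symmetrized against round-off."""
    B = np.linalg.inv(A)
    return 0.5 * (B + B.T)

def log_gauss_wishart(mu, K, mu0, Sigma0, mu_Sigma, N0):
    """Log of p(mu, K | mu0, Sigma0, mu_Sigma, N0) as a sum of two factors."""
    # Gaussian factor: mu | K ~ N(mu0, (mu_Sigma * K)^{-1})
    log_gauss = multivariate_normal.logpdf(mu, mean=mu0, cov=_sym_inv(mu_Sigma * K))
    # Wishart factor on K (assumed parameterization, see the lead-in)
    log_wish = wishart.logpdf(K, df=N0 - 1, scale=_sym_inv(N0 * Sigma0))
    return log_gauss + log_wish

# Hypothetical example values, chosen only for illustration.
mu0 = np.array([0.0, 0.0])
Sigma0 = np.array([[1.0, 0.2], [0.2, 2.0]])
mu_Sigma = 3.0
N0 = 10
mu = np.array([0.5, -0.25])
K = np.array([[0.9, -0.1], [-0.1, 0.6]])

print(log_gauss_wishart(mu, K, mu0, Sigma0, mu_Sigma, N0))
```

Working in log space avoids underflow in the determinant and exponential terms when $n$ or $N_0$ is large.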
